
How to Check SSD/HDD health in Linux

Check SSD/HDD health using smartmontools on Linux

Smartmontools (S.M.A.R.T. Monitoring Tools) is a set of utility programs (smartctl and smartd) to control and monitor computer storage systems using the Self-Monitoring, Analysis and Reporting Technology (S.M.A.R.T.) system built into most modern (P)ATA, Serial ATA, SCSI/SAS and NVMe hard drives.

Smartmontools displays early warning signs of hard drive problems detected by S.M.A.R.T., often giving notice of impending failure while it is still possible to back data up.
In this post, we will show you how to check SSD and HDD health on Linux.

Install Smartctl

The smartmontools package, which provides smartctl and smartd, is available in the default repositories of all major Linux distributions.
On Debian and Ubuntu, install it using the following command:

# sudo apt-get install smartmontools -y

On RHEL, CentOS, and Fedora, install it using the following command:

sudo dnf install smartmontools
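
If you manage several distributions, the choice between the two commands above can be scripted. The following is a minimal sketch, not an official tool: install_cmd_for is a hypothetical helper name, and it assumes the ID field from /etc/os-release is passed in as its argument. It only prints the command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: map an os-release ID to the install command
# used in this guide. Prints the command; it does not execute it.
install_cmd_for() {
    case "$1" in
        debian|ubuntu)      echo "sudo apt-get install smartmontools -y" ;;
        rhel|centos|fedora) echo "sudo dnf install smartmontools" ;;
        *)                  echo "unsupported distribution: $1" >&2
                            return 1 ;;
    esac
}

# Usage: load ID from /etc/os-release, then print the matching command.
#   . /etc/os-release && install_cmd_for "$ID"
```

Distributions not listed in the guide fall through to the error branch, so extend the case patterns for your own systems as needed.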

After installing smartmontools, start the smartd service using the following command:

# sudo systemctl start smartd

You can check the status of smartd with the following command:

systemctl status smartd

You should see output similar to the following:

# systemctl status smartd
● smartmontools.service - Self Monitoring and Reporting Technology (SMART) Daem>
     Loaded: loaded (/lib/systemd/system/smartmontools.service; enabled; vendor>
     Active: active (running) since Thu 2021-08-12 08:06:42 CEST; 22s ago
       Docs: man:smartd(8)
             man:smartd.conf(5)
   Main PID: 71321 (smartd)
     Status: "Next check of 1 device will start at 08:36:42"
      Tasks: 1 (limit: 9278)
     Memory: 1.7M
     CGroup: /system.slice/smartmontools.service
             └─71321 /usr/sbin/smartd -n

Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], opened
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], PNY CS900 120GB >
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], not found in sma>
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], can't monitor Cu>
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], can't monitor Of>
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], is SMART capable>
Aug 12 08:06:42 Gandalf smartd[71321]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 0 >
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], previous self-te>
Aug 12 08:06:42 Gandalf smartd[71321]: Device: /dev/sda [SAT], state written to>
Aug 12 08:06:42 Gandalf systemd[1]: Started Self Monitoring and Reporting Techn>

Test Health of SSD/HDD

To check the overall health of the drive, type:

# sudo smartctl -d ata -H /dev/sda

Where:
-d – Specifies the type of device.
ata – The device type is ATA; use scsi for SCSI devices.
-H – Reports the device's SMART health status.

# sudo smartctl -d ata -H /dev/sda
[sudo] password for rasho:             
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-80-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
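
A check like the one above can be turned into a scripted test. The sketch below assumes the ATA-style result line shown in the sample output; other device types may word that line differently. The function names health_verdict and health_ok are hypothetical.

```shell
#!/bin/sh
# Reads `smartctl -H` output on stdin and prints the overall-health verdict.
# Assumes the ATA-style result line shown in this guide.
health_verdict() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Succeeds only when the verdict read from stdin is exactly PASSED.
health_ok() {
    [ "$(health_verdict)" = "PASSED" ]
}

# Usage (needs root and a real drive):
#   sudo smartctl -d ata -H /dev/sda | health_ok || echo "check /dev/sda" >&2
```

Because health_ok returns a normal exit status, it can be used directly in a cron job or a monitoring script.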

The result PASSED means the drive's own self-assessment did not detect an imminent failure. If the device reports a failing health status, it has either already failed or could fail very soon.
If the check reports a failure, use the -a option to get more information:

# sudo smartctl -a /dev/sda

Example output:

# sudo smartctl -a /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-80-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     PNY CS900 120GB SSD
Serial Number:    PNY07190003520101427
LU WWN Device Id: 5 f8db4c 071901427
Firmware Version: CS900612
User Capacity:    120,034,123,776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Aug 12 08:26:38 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (  32)	The self-test routine was interrupted
					by the host with a hard or soft reset.
Total time to complete Offline 
data collection: 		(65535) seconds.
Offline data collection
capabilities: 			 (0x79) SMART execute Offline immediate.
					No Auto Offline data collection support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  30) minutes.
Conveyance self-test routine
recommended polling time: 	 (   6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       5116
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       3573
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
170 Unknown_Attribute       0x0003   093   093   000    Pre-fail  Always       -       65
173 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       10420411
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       225
194 Temperature_Celsius     0x0023   067   067   000    Pre-fail  Always       -       33 (Min/Max 33/33)
218 Unknown_Attribute       0x000b   100   100   050    Pre-fail  Always       -       0
231 Temperature_Celsius     0x0013   100   100   000    Pre-fail  Always       -       94
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       12374

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      00%      1822         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

You can monitor the following attributes:
[ID 5] Reallocated Sectors Count – Number of sectors reallocated due to read errors.
[ID 187] Reported Uncorrect – Number of uncorrectable errors encountered while reading from or writing to a sector.
[ID 230] Media Wearout Indicator – Current state of drive operation based upon the Life Curve.
100 is the best value and 0 is the worst. Attribute IDs are vendor-specific, so not every drive reports all of them; the sample drive above, for example, lists none of these three.
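
To pick selected attributes out of the table, the output can be filtered with awk. This is a sketch under the assumption that the table has the ten-column ATA layout shown in the sample output; watch_attrs is a hypothetical name. It marks an attribute as FAIL when its normalized VALUE is at or below a non-zero THRESH.

```shell
#!/bin/sh
# Reads `smartctl -A` (or -a) output on stdin. For each attribute ID given
# as an argument, prints: ID NAME VALUE THRESH RAW STATE.
# STATE is "FAIL" when VALUE is at or below a non-zero THRESH, else "ok".
watch_attrs() {
    ids=" $* "
    awk -v ids="$ids" '
        # Attribute rows have a numeric ID in column 1 and a hex flag in column 3.
        NF >= 10 && $1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && index(ids, " " $1 " ") {
            state = ($6 + 0 > 0 && $4 + 0 <= $6 + 0) ? "FAIL" : "ok"
            print $1, $2, $4, $6, $10, state
        }'
}

# Usage (needs root and a real drive):
#   sudo smartctl -A /dev/sda | watch_attrs 5 187 230
```

An ID that the drive does not report simply produces no line, so empty output is not proof that the drive is healthy.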

Check SMART Attribute Details for more information.
To start an extended (long) self-test, use the following command:

# sudo smartctl -t long /dev/sda

To run a short self-test instead, run:

# sudo smartctl -t short /dev/sda

To view the drive's self-test results, use the following command:

# sudo smartctl -l selftest /dev/sda
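
The self-test log can also be read from a script. The sketch below assumes the ATA log layout shown in the sample output earlier, where the newest entry is numbered "# 1" and columns are separated by two or more spaces; latest_selftest_status is a hypothetical name.

```shell
#!/bin/sh
# Reads `smartctl -l selftest` output on stdin and prints the Status column
# of the most recent entry (the row numbered "# 1").
# Assumes columns are separated by two or more spaces.
latest_selftest_status() {
    awk -F '  +' '/^# *1 / { print $3; exit }'
}

# Usage (needs root and a real drive):
#   sudo smartctl -l selftest /dev/sda | latest_selftest_status
```

If no self-test has ever been run, the log has no "# 1" row and the function prints nothing.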

To see the estimated time each test takes, run the following command:

# sudo smartctl -c /dev/sda
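
If you want to wait for a test from a script, the recommended polling time can be extracted from this output. The sketch below assumes the two-line layout shown in the sample output earlier ("Short self-test routine" followed by "recommended polling time: ( 2) minutes."); polling_minutes is a hypothetical name.

```shell
#!/bin/sh
# Reads `smartctl -c` output on stdin and prints the recommended polling
# time, in minutes, for the given self-test kind: Short, Extended or
# Conveyance.
polling_minutes() {
    awk -v kind="$1" '
        # The heading line starts with e.g. "Short self-test routine".
        index($0, kind " self-test routine") == 1 { want = 1; next }
        # The following line carries the number in parentheses.
        want {
            if (match($0, /\( *[0-9]+\)/)) {
                s = substr($0, RSTART + 1, RLENGTH - 2)
                print (s + 0)
            }
            exit
        }'
}

# Usage (needs root and a real drive):
#   mins=$(sudo smartctl -c /dev/sda | polling_minutes Short)
#   sudo smartctl -t short /dev/sda && sleep "$((mins * 60))"
```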

You can print error logs of the disk by using the command:

# sudo smartctl -l error /dev/sda
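
Besides its printed output, smartctl reports its findings as a bit mask in its exit status, which is handy in scripts. The sketch below decodes that mask; the bit meanings are paraphrased from the EXIT STATUS section of the smartctl man page, and decode_smartctl_status is a hypothetical name.

```shell
#!/bin/sh
# Prints one line per bit set in a smartctl exit status (0-255).
# Meanings are paraphrased from the EXIT STATUS section of smartctl(8).
decode_smartctl_status() {
    status=$1
    bit=0
    for meaning in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or ATA command failed, or a checksum error occurred" \
        "SMART status reports DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "device error log contains errors" \
        "self-test log contains errors"
    do
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
        bit=$((bit + 1))
    done
}

# Usage (needs root and a real drive):
#   sudo smartctl -H -l error /dev/sda >/dev/null
#   decode_smartctl_status $?
```

An exit status of 0 prints nothing, which means smartctl found no problem in what it was asked to check.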

To get help information, run the following command:

# sudo smartctl --help

Conclusion

In this guide, you learned how to install smartmontools and use smartctl to check the health of your SSD and HDD drives. I hope this helps. For more information, read the smartctl man page.
