@delboy711 delboy711 commented Aug 25, 2025

Changes to src/rockstor/system/smart.py to parse smartctl output for nvme drives to populate the existing s.m.a.r.t. pages with nvme data wherever possible.

Fixes #3013

It should be possible to start self tests and display test logs in the same way as for SATA drives.
I have not tested with error logs since I do not have any drives with errors. If anyone can supply error log output for an nvme drive, that would be useful.

[Screenshots: nvme_smart-01, nvme_smart-02, nvme_smart-03, nvme_smart-04, nvme_smart-05]

@phillxnet
Member

@delboy711 This is nice, and a very welcome addition. Thanks. It's been bugging me for some time, but alas I have no current nvme access to develop against. I did have, but a move/emigration forced some significant downsizing.

I've edited the PR text a tad re adding the issue counterpart you created, thanks. This is required so we can tie them together and helps with attribution. Marking for review from someone with access to a real nvme hopefully :) .

Thanks also for the detailed issue & PR (reviewers should reference the issue also). If no core developer has nvme to hand on a test machine then no-worries, we are just about to embark on a fresh testing phase and as @delboy711 states: there should be no interaction with the existing non-nvme function anyway. Plus from the issue #3013 (comment) we have:

I have therefore had a quick and dirty go at implementing smart for nvme drives and have a PR ready.
Functionality for SATA drives should be unaffected.

Also, well done on finding that tab/spaces pattern addition. This side of the SMART data does tend towards fragility - but of late we have had few failure reports. But of course the nvme output is all different again!!! So I guess we are now on another run of tidy-ups re various outputs as-and-when we encounter them. But all good in time.

A small suggestion: maybe if we have no test report info, the spoofing of 1 min and 5 min could be represented as "unknown" or "unreported" or the like, given that is the case here. Otherwise we essentially mislead folks into expecting a 1 or 5 minute test when in fact we have no idea :) . Also, if you could add a description to the commit that indicates what was intended here, that would be great: I don't like to depend only on GitHub references, and we normally have our non-trivial commits contain the essence of what was attempted. This ensures that our git repo itself, which may not be in GitHub forever, will maintain a history of what was intended.

Since smartctl does not report available self tests or their duration for nvme drives,
this information is spoofed in the capabilities Tab.
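
A minimal sketch of the "unreported" suggestion above. The function name, row layout, and labels are hypothetical and are not taken from the PR or from Rockstor's models; the only point is that a missing duration is shown as such rather than replaced by a guessed number of minutes.

```python
from typing import Optional

# Hypothetical label for durations that smartctl does not report.
UNREPORTED = "unreported"


def nvme_self_test_rows(
    short_minutes: Optional[int] = None,
    extended_minutes: Optional[int] = None,
) -> list[dict]:
    """Build display rows for the NVMe self tests.

    A duration of None means smartctl gave us nothing to show, so the
    row says so instead of inventing a value.
    """

    def fmt(minutes: Optional[int]) -> str:
        return UNREPORTED if minutes is None else f"{minutes} min"

    return [
        {"name": "Short self-test", "duration": fmt(short_minutes)},
        {"name": "Extended self-test", "duration": fmt(extended_minutes)},
    ]


rows = nvme_self_test_rows()
```

If a later smartctl release does report durations, the same rows can carry real values without changing the display code.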
@delboy711
Author

I have found examples of nvme smartctl error logs on the web and have tested them with this PR, and they worked perfectly.
[Screenshot: nvme_smart-06]

@delboy711
Author

While this PR does provide useful support for nvme drives in Rockstor, it could be better.
The 'Identity' Tab in particular leaves out useful information about the drive because there is no one-to-one mapping with the SATA field descriptions, which are hard-wired into Rockstor. The raw information from smartctl looks like this:

Model Number:                       SAMSUNG MZVL21T0HCLR-00BH1
Serial Number:                      S641NX0Y213059
Firmware Version:                   HPS4NGXH
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 1,024,209,543,168 [1.02 TB]
Unallocated NVM Capacity:           0
Controller ID:                      6
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,024,209,543,168 [1.02 TB]
Namespace 1 Utilization:            82,113,654,784 [82.1 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 b251b11b0f
Local Time is:                      Thu Aug 28 08:53:45 2025 BST
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0057):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     81 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     8.48W       -        -    0  0  0  0        0       0
 1 +     5.03W       -        -    1  1  1  1        0     200
 2 +     4.36W       -        -    2  2  2  2        0    1000
 3 -   0.0500W       -        -    3  3  3  3     2000    1200
 4 -   0.0050W       -        -    4  4  4  4      500    9500

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
 1 -    4096       0         0

To get that information on the Identity Tab would require a rewrite of the parsing of smartctl and the model Classes, and would take some time. What do you think, @phillxnet: is the Identity Tab as shown here "good enough", or should I try to do better?
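
To give a feel for what a more generic parse of the information section could look like, here is a rough sketch. It is not the code in this PR, and `parse_nvme_identity` is a made-up name; it simply collects every "Key: value" line into a dict and ignores the power state and LBA size tables, which would need their own handling.

```python
import re

# "Key:   value" lines. The key may contain spaces, dots, slashes and
# parentheses, but never a colon, so we split on the first colon.
KV_LINE = re.compile(r"^(?P<key>[^:]+):\s+(?P<value>.*)$")


def parse_nvme_identity(lines):
    """Return a dict of the key/value lines in smartctl's information section.

    Lines without a colon (table headers, table rows, blank lines) are
    skipped. Runs of spaces inside a key are collapsed to one space.
    """
    info = {}
    for line in lines:
        match = KV_LINE.match(line)
        if match is None:
            continue
        key = " ".join(match.group("key").split())
        info[key] = match.group("value").strip()
    return info


sample = [
    "Model Number:                       SAMSUNG MZVL21T0HCLR-00BH1",
    "Local Time is:                      Thu Aug 28 08:53:45 2025 BST",
    "Warning  Comp. Temp. Threshold:     81 Celsius",
    "",
    "Supported Power States",
    " 0 +     8.48W       -        -    0  0  0  0        0       0",
]
identity = parse_nvme_identity(sample)
```

A dict like this would let the display skip entries that are absent for a given drive type, rather than forcing every field into the SATA names.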

@phillxnet
Member

@delboy711 Re:

What do you think @phillxnet is the Identity Tab as shown here "good enough" ...

I think any improvement on nothing for nvme devices (prior state before this patch) is a win; and as you say the next stage here

would require a rewrite of the parsing of smartctl and the model Classes

which is definitely a larger endeavour. However now you have identified, and implemented, the more obvious matches, it is now clearer how/where we have to adapt. Likely as you say this would be an extension to the existing models, and potentially skipping empty entries within the display. That is deeper than this first pass and so can easily qualify as a follow-up pull request; referencing this pull request as tending to the existing cross-over/commonality within our existing models and display.

@phillxnet
Member

Let us know when you think this is ready for final review, and if possible a squash to a single commit would be good beforehand. We can then get this into the wild, in testing, and move on to the model extension and parsing improvements as interest and resources (human mainly) present themselves :) . Plus we have some significant progress here already - within the current limitations of our existing structures anyway. Do feel free to extend and re-work as you see fit, but there is definitely no problem in a partial improvement; there are always improvements to be had.

When we first did our SMART info parsing & presentation it worked a treat on the simulated output from qemu sata devices, and for all drives held by the developers at that time. But once released into the wild we had a long string of non-conforming output that broke what we had in sometimes quite subtle ways. I'm anticipating the same with this move into nvme smart parsing. So all-in, when tending to real-hardware, and the myriad of ways manufacturers interpret more or less exact specifications, we have to wing-it a tad. Ergo bit-by-bit facilitates adapting our systems as and when folks show an interest in improving them either via feedback, or such as you have just done here, via pull requests.

Thanks again for taking a look at this, and I'm sure you have now seen some of the adaptations we have had to make over the years in our existing parsing. In time, with sufficient attention, we should also have a full presentation for at least the common nvme manufacturers' SMART output on these devices. But as always these things take time and testing on a range of real hardware to get to a polished state.

@kanecko
Contributor

kanecko commented Sep 29, 2025

I suggest first finishing the improvement you've started now, and embark on the rewrite in a separate PR.

Not sure how far along this is, but I will gladly test the PR on my nvme disk.

@kanecko
Contributor

kanecko commented Oct 3, 2025

Upon build & restart I get the following error:

[03/Oct/2025 21:00:00] INFO [scripts.initrock:476] ### BEGIN Establishing SMB config preexec update...
[03/Oct/2025 21:00:00] INFO [scripts.initrock:489] smb.conf preexec already updated
[03/Oct/2025 21:00:00] INFO [scripts.initrock:490] ### DONE Establishing SMB config preexec update...
[03/Oct/2025 21:00:05] ERROR [storageadmin.views.disk:376] Error running a command. cmd = /usr/sbin/smartctl --all /dev/disk/by-id/nvme-nvme.c0a9-323231354536323544413434-4354323530503253534438-00000001. rc = 4. stdout = ['smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.4.0-150600.23.70-default] (SUSE RPM)', 'Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org', '', '=== START OF INFORMATION SECTION ===', 'Model Number:                       CT250P2SSD8', 'Serial Number:                      2215E625DA44', 'Firmware Version:                   P2CR048', 'PCI Vendor/Subsystem ID:            0xc0a9', 'IEEE OUI Identifier:                0x00a075', 'Total NVM Capacity:                 250,059,350,016 [250 GB]', 'Unallocated NVM Capacity:           0', 'Controller ID:                      1', 'NVMe Version:                       1.3', 'Number of Namespaces:               1', 'Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]', 'Namespace 1 Formatted LBA Size:     512', 'Namespace 1 IEEE EUI-64:            00a075 6120000284', 'Local Time is:                      Fri Oct  3 21:00:05 2025 CEST', 'Firmware Updates (0x12):            1 Slot, no Reset required', 'Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test', 'Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp', 'Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg', 'Maximum Data Transfer Size:         64 Pages', 'Warning  Comp. Temp. Threshold:     70 Celsius', 'Critical Comp. Temp. 
Threshold:     85 Celsius', '', 'Supported Power States', 'St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat', ' 0 +     3.50W       -        -    0  0  0  0        0       0', ' 1 +     1.90W       -        -    1  1  1  1        0       0', ' 2 +     1.50W       -        -    2  2  2  2        0       0', ' 3 -   0.0700W       -        -    3  3  3  3     5000    1900', ' 4 -   0.0020W       -        -    4  4  4  4    13000  100000', '', 'Supported LBA Sizes (NSID 0x1)', 'Id Fmt  Data  Metadt  Rel_Perf', ' 0 +     512       0         1', ' 1 -    4096       0         0', '', '=== START OF SMART DATA SECTION ===', 'SMART overall-health self-assessment test result: PASSED', '', 'SMART/Health Information (NVMe Log 0x02)', 'Critical Warning:                   0x00', 'Temperature:                        40 Celsius', 'Available Spare:                    100%', 'Available Spare Threshold:          5%', 'Percentage Used:                    0%', 'Data Units Read:                    1,559,455 [798 GB]', 'Data Units Written:                 10,280,250 [5.26 TB]', 'Host Read Commands:                 22,076,117', 'Host Write Commands:                192,975,250', 'Controller Busy Time:               523', 'Power Cycles:                       21', 'Power On Hours:                     23,621', 'Unsafe Shutdowns:                   11', 'Media and Data Integrity Errors:    0', 'Error Information Log Entries:      280,687', 'Warning  Comp. Temperature Time:    0', 'Critical Comp. 
Temperature Time:    0', 'Temperature Sensor 1:               57 Celsius', '', 'Error Information (NVMe Log 0x01, 16 of 16 entries)', 'Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message', '  0     280687     0  0x2017  0x4005      -            0     0     -  Invalid Field in Command', '  1     280686     0  0x2016  0x4005      -            0     0     -  Invalid Field in Command', '  2     280685     0  0x4012  0x4005      -            0     0     -  Invalid Field in Command', '  3     280684     0  0x4011  0x4005      -            0     0     -  Invalid Field in Command', '  4     280683     0  0x001a  0x4005      -            0     0     -  Invalid Field in Command', '  5     280682     0  0x0019  0x4005      -            0     0     -  Invalid Field in Command', '  6     280681     0  0xd006  0x4005      -            0     0     -  Invalid Field in Command', '  7     280680     0  0xd005  0x4005      -            0     0     -  Invalid Field in Command', '  8     280679     0  0xc000  0x4004      -            0     0     -  Invalid Field in Command', '  9     280678     0  0xb003  0x4004      -            0     0     -  Invalid Field in Command', ' 10     280677     0  0x2014  0x4004      -            0     0     -  Invalid Field in Command', ' 11     280676     0  0xe019  0x4004      -            0     0     -  Invalid Field in Command', ' 12     280675     0  0xd019  0x4004      -            0     0     -  Invalid Field in Command', ' 13     280674     0  0xd018  0x4004      -            0     0     -  Invalid Field in Command', ' 14     280673     0  0xb01a  0x4004      -            0     0     -  Invalid Field in Command', ' 15     280672     0  0xb019  0x4004      -            0     0     -  Invalid Field in Command', '', 'Read Self-test Log failed: Invalid Field in Command (0x2002)', '', '']. stderr = ['']
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 372, in _update_disk_state
    db_disk.smart_available, db_disk.smart_enabled = smart.available(
                                                     ^^^^^^^^^^^^^^^^
  File "/opt/rockstor/src/rockstor/system/smart.py", line 354, in available
    o, e, rc = run_command(
               ^^^^^^^^^^^^
  File "/opt/rockstor/src/rockstor/system/osi.py", line 289, in run_command
    raise CommandException(cmd, out, err, rc)
system.exceptions.CommandException: Error running a command. cmd = /usr/sbin/smartctl --all /dev/disk/by-id/nvme-nvme.c0a9-323231354536323544413434-4354323530503253534438-00000001. rc = 4. stdout = ['smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.4.0-150600.23.70-default] (SUSE RPM)', 'Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org', '', '=== START OF INFORMATION SECTION ===', 'Model Number:                       CT250P2SSD8', 'Serial Number:                      2215E625DA44', 'Firmware Version:                   P2CR048', 'PCI Vendor/Subsystem ID:            0xc0a9', 'IEEE OUI Identifier:                0x00a075', 'Total NVM Capacity:                 250,059,350,016 [250 GB]', 'Unallocated NVM Capacity:           0', 'Controller ID:                      1', 'NVMe Version:                       1.3', 'Number of Namespaces:               1', 'Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]', 'Namespace 1 Formatted LBA Size:     512', 'Namespace 1 IEEE EUI-64:            00a075 6120000284', 'Local Time is:                      Fri Oct  3 21:00:05 2025 CEST', 'Firmware Updates (0x12):            1 Slot, no Reset required', 'Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test', 'Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp', 'Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg', 'Maximum Data Transfer Size:         64 Pages', 'Warning  Comp. Temp. Threshold:     70 Celsius', 'Critical Comp. Temp. 
Threshold:     85 Celsius', '', 'Supported Power States', 'St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat', ' 0 +     3.50W       -        -    0  0  0  0        0       0', ' 1 +     1.90W       -        -    1  1  1  1        0       0', ' 2 +     1.50W       -        -    2  2  2  2        0       0', ' 3 -   0.0700W       -        -    3  3  3  3     5000    1900', ' 4 -   0.0020W       -        -    4  4  4  4    13000  100000', '', 'Supported LBA Sizes (NSID 0x1)', 'Id Fmt  Data  Metadt  Rel_Perf', ' 0 +     512       0         1', ' 1 -    4096       0         0', '', '=== START OF SMART DATA SECTION ===', 'SMART overall-health self-assessment test result: PASSED', '', 'SMART/Health Information (NVMe Log 0x02)', 'Critical Warning:                   0x00', 'Temperature:                        40 Celsius', 'Available Spare:                    100%', 'Available Spare Threshold:          5%', 'Percentage Used:                    0%', 'Data Units Read:                    1,559,455 [798 GB]', 'Data Units Written:                 10,280,250 [5.26 TB]', 'Host Read Commands:                 22,076,117', 'Host Write Commands:                192,975,250', 'Controller Busy Time:               523', 'Power Cycles:                       21', 'Power On Hours:                     23,621', 'Unsafe Shutdowns:                   11', 'Media and Data Integrity Errors:    0', 'Error Information Log Entries:      280,687', 'Warning  Comp. Temperature Time:    0', 'Critical Comp. 
Temperature Time:    0', 'Temperature Sensor 1:               57 Celsius', '', 'Error Information (NVMe Log 0x01, 16 of 16 entries)', 'Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message', '  0     280687     0  0x2017  0x4005      -            0     0     -  Invalid Field in Command', '  1     280686     0  0x2016  0x4005      -            0     0     -  Invalid Field in Command', '  2     280685     0  0x4012  0x4005      -            0     0     -  Invalid Field in Command', '  3     280684     0  0x4011  0x4005      -            0     0     -  Invalid Field in Command', '  4     280683     0  0x001a  0x4005      -            0     0     -  Invalid Field in Command', '  5     280682     0  0x0019  0x4005      -            0     0     -  Invalid Field in Command', '  6     280681     0  0xd006  0x4005      -            0     0     -  Invalid Field in Command', '  7     280680     0  0xd005  0x4005      -            0     0     -  Invalid Field in Command', '  8     280679     0  0xc000  0x4004      -            0     0     -  Invalid Field in Command', '  9     280678     0  0xb003  0x4004      -            0     0     -  Invalid Field in Command', ' 10     280677     0  0x2014  0x4004      -            0     0     -  Invalid Field in Command', ' 11     280676     0  0xe019  0x4004      -            0     0     -  Invalid Field in Command', ' 12     280675     0  0xd019  0x4004      -            0     0     -  Invalid Field in Command', ' 13     280674     0  0xd018  0x4004      -            0     0     -  Invalid Field in Command', ' 14     280673     0  0xb01a  0x4004      -            0     0     -  Invalid Field in Command', ' 15     280672     0  0xb019  0x4004      -            0     0     -  Invalid Field in Command', '', 'Read Self-test Log failed: Invalid Field in Command (0x2002)', '', '']. stderr = ['']
[03/Oct/2025 21:00:13] ERROR [storageadmin.views.disk:376] Error running a command. cmd = /usr/sbin/smartctl --all /dev/disk/by-id/nvme-nvme.c0a9-323231354536323544413434-4354323530503253534438-00000001. rc = 4. stdout = ['smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.4.0-150600.23.70-default] (SUSE RPM)', 'Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org', '', '=== START OF INFORMATION SECTION ===', 'Model Number:                       CT250P2SSD8', 'Serial Number:                      2215E625DA44', 'Firmware Version:                   P2CR048', 'PCI Vendor/Subsystem ID:            0xc0a9', 'IEEE OUI Identifier:                0x00a075', 'Total NVM Capacity:                 250,059,350,016 [250 GB]', 'Unallocated NVM Capacity:           0', 'Controller ID:                      1', 'NVMe Version:                       1.3', 'Number of Namespaces:               1', 'Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]', 'Namespace 1 Formatted LBA Size:     512', 'Namespace 1 IEEE EUI-64:            00a075 6120000284', 'Local Time is:                      Fri Oct  3 21:00:13 2025 CEST', 'Firmware Updates (0x12):            1 Slot, no Reset required', 'Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test', 'Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp', 'Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg', 'Maximum Data Transfer Size:         64 Pages', 'Warning  Comp. Temp. Threshold:     70 Celsius', 'Critical Comp. Temp. 
Threshold:     85 Celsius', '', 'Supported Power States', 'St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat', ' 0 +     3.50W       -        -    0  0  0  0        0       0', ' 1 +     1.90W       -        -    1  1  1  1        0       0', ' 2 +     1.50W       -        -    2  2  2  2        0       0', ' 3 -   0.0700W       -        -    3  3  3  3     5000    1900', ' 4 -   0.0020W       -        -    4  4  4  4    13000  100000', '', 'Supported LBA Sizes (NSID 0x1)', 'Id Fmt  Data  Metadt  Rel_Perf', ' 0 +     512       0         1', ' 1 -    4096       0         0', '', '=== START OF SMART DATA SECTION ===', 'SMART overall-health self-assessment test result: PASSED', '', 'SMART/Health Information (NVMe Log 0x02)', 'Critical Warning:                   0x00', 'Temperature:                        41 Celsius', 'Available Spare:                    100%', 'Available Spare Threshold:          5%', 'Percentage Used:                    0%', 'Data Units Read:                    1,559,664 [798 GB]', 'Data Units Written:                 10,280,331 [5.26 TB]', 'Host Read Commands:                 22,076,987', 'Host Write Commands:                192,978,732', 'Controller Busy Time:               523', 'Power Cycles:                       21', 'Power On Hours:                     23,621', 'Unsafe Shutdowns:                   11', 'Media and Data Integrity Errors:    0', 'Error Information Log Entries:      280,688', 'Warning  Comp. Temperature Time:    0', 'Critical Comp. 
Temperature Time:    0', 'Temperature Sensor 1:               57 Celsius', '', 'Error Information (NVMe Log 0x01, 16 of 16 entries)', 'Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message', '  0     280688     0  0x8007  0x4004  0x004            0     1     -  Invalid Field in Command', '  1     280687     0  0x2017  0x4005      -            0     0     -  Invalid Field in Command', '  2     280686     0  0x2016  0x4005      -            0     0     -  Invalid Field in Command', '  3     280685     0  0x4012  0x4005      -            0     0     -  Invalid Field in Command', '  4     280684     0  0x4011  0x4005      -            0     0     -  Invalid Field in Command', '  5     280683     0  0x001a  0x4005      -            0     0     -  Invalid Field in Command', '  6     280682     0  0x0019  0x4005      -            0     0     -  Invalid Field in Command', '  7     280681     0  0xd006  0x4005      -            0     0     -  Invalid Field in Command', '  8     280680     0  0xd005  0x4005      -            0     0     -  Invalid Field in Command', '  9     280679     0  0xc000  0x4004      -            0     0     -  Invalid Field in Command', ' 10     280678     0  0xb003  0x4004      -            0     0     -  Invalid Field in Command', ' 11     280677     0  0x2014  0x4004      -            0     0     -  Invalid Field in Command', ' 12     280676     0  0xe019  0x4004      -            0     0     -  Invalid Field in Command', ' 13     280675     0  0xd019  0x4004      -            0     0     -  Invalid Field in Command', ' 14     280674     0  0xd018  0x4004      -            0     0     -  Invalid Field in Command', ' 15     280673     0  0xb01a  0x4004      -            0     0     -  Invalid Field in Command', '', 'Read Self-test Log failed: Invalid Field in Command (0x2002)', '', '']. stderr = ['']
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 372, in _update_disk_state
    db_disk.smart_available, db_disk.smart_enabled = smart.available(
                                                     ^^^^^^^^^^^^^^^^
  File "/opt/rockstor/src/rockstor/system/smart.py", line 354, in available
    o, e, rc = run_command(
               ^^^^^^^^^^^^
  File "/opt/rockstor/src/rockstor/system/osi.py", line 289, in run_command
    raise CommandException(cmd, out, err, rc)
system.exceptions.CommandException: Error running a command. cmd = /usr/sbin/smartctl --all /dev/disk/by-id/nvme-nvme.c0a9-323231354536323544413434-4354323530503253534438-00000001. rc = 4. stdout = ['smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.4.0-150600.23.70-default] (SUSE RPM)', 'Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org', '', '=== START OF INFORMATION SECTION ===', 'Model Number:                       CT250P2SSD8', 'Serial Number:                      2215E625DA44', 'Firmware Version:                   P2CR048', 'PCI Vendor/Subsystem ID:            0xc0a9', 'IEEE OUI Identifier:                0x00a075', 'Total NVM Capacity:                 250,059,350,016 [250 GB]', 'Unallocated NVM Capacity:           0', 'Controller ID:                      1', 'NVMe Version:                       1.3', 'Number of Namespaces:               1', 'Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]', 'Namespace 1 Formatted LBA Size:     512', 'Namespace 1 IEEE EUI-64:            00a075 6120000284', 'Local Time is:                      Fri Oct  3 21:00:13 2025 CEST', 'Firmware Updates (0x12):            1 Slot, no Reset required', 'Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test', 'Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp', 'Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg', 'Maximum Data Transfer Size:         64 Pages', 'Warning  Comp. Temp. Threshold:     70 Celsius', 'Critical Comp. Temp. 
Threshold:     85 Celsius', '', 'Supported Power States', 'St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat', ' 0 +     3.50W       -        -    0  0  0  0        0       0', ' 1 +     1.90W       -        -    1  1  1  1        0       0', ' 2 +     1.50W       -        -    2  2  2  2        0       0', ' 3 -   0.0700W       -        -    3  3  3  3     5000    1900', ' 4 -   0.0020W       -        -    4  4  4  4    13000  100000', '', 'Supported LBA Sizes (NSID 0x1)', 'Id Fmt  Data  Metadt  Rel_Perf', ' 0 +     512       0         1', ' 1 -    4096       0         0', '', '=== START OF SMART DATA SECTION ===', 'SMART overall-health self-assessment test result: PASSED', '', 'SMART/Health Information (NVMe Log 0x02)', 'Critical Warning:                   0x00', 'Temperature:                        41 Celsius', 'Available Spare:                    100%', 'Available Spare Threshold:          5%', 'Percentage Used:                    0%', 'Data Units Read:                    1,559,664 [798 GB]', 'Data Units Written:                 10,280,331 [5.26 TB]', 'Host Read Commands:                 22,076,987', 'Host Write Commands:                192,978,732', 'Controller Busy Time:               523', 'Power Cycles:                       21', 'Power On Hours:                     23,621', 'Unsafe Shutdowns:                   11', 'Media and Data Integrity Errors:    0', 'Error Information Log Entries:      280,688', 'Warning  Comp. Temperature Time:    0', 'Critical Comp. 
Temperature Time:    0', 'Temperature Sensor 1:               57 Celsius', '', 'Error Information (NVMe Log 0x01, 16 of 16 entries)', 'Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message', '  0     280688     0  0x8007  0x4004  0x004            0     1     -  Invalid Field in Command', '  1     280687     0  0x2017  0x4005      -            0     0     -  Invalid Field in Command', '  2     280686     0  0x2016  0x4005      -            0     0     -  Invalid Field in Command', '  3     280685     0  0x4012  0x4005      -            0     0     -  Invalid Field in Command', '  4     280684     0  0x4011  0x4005      -            0     0     -  Invalid Field in Command', '  5     280683     0  0x001a  0x4005      -            0     0     -  Invalid Field in Command', '  6     280682     0  0x0019  0x4005      -            0     0     -  Invalid Field in Command', '  7     280681     0  0xd006  0x4005      -            0     0     -  Invalid Field in Command', '  8     280680     0  0xd005  0x4005      -            0     0     -  Invalid Field in Command', '  9     280679     0  0xc000  0x4004      -            0     0     -  Invalid Field in Command', ' 10     280678     0  0xb003  0x4004      -            0     0     -  Invalid Field in Command', ' 11     280677     0  0x2014  0x4004      -            0     0     -  Invalid Field in Command', ' 12     280676     0  0xe019  0x4004      -            0     0     -  Invalid Field in Command', ' 13     280675     0  0xd019  0x4004      -            0     0     -  Invalid Field in Command', ' 14     280674     0  0xd018  0x4004      -            0     0     -  Invalid Field in Command', ' 15     280673     0  0xb01a  0x4004      -            0     0     -  Invalid Field in Command', '', 'Read Self-test Log failed: Invalid Field in Command (0x2002)', '', '']. stderr = ['']
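
For anyone reading the log above: the smartctl man page documents its exit status as a bit mask, and `rc = 4` means only bit 2 is set (some SMART or other command to the disk failed). That looks consistent with the `Read Self-test Log failed` line at the end of the output, while the rest of the report was produced normally. A small sketch of decoding the status follows; the helper names are hypothetical, the descriptions are abbreviated from the man page, and whether Rockstor should tolerate this status is a separate question.

```python
# Bit positions of smartctl's exit status, abbreviated from its man page.
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed, or device did not identify itself",
    2: "a SMART or other command to the disk failed",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold at some time in the past",
    6: "device error log contains records of errors",
    7: "device self-test log contains records of errors",
}


def decode_smartctl_rc(rc: int) -> list[str]:
    """Return the description of every bit set in a smartctl exit status."""
    return [msg for bit, msg in SMARTCTL_EXIT_BITS.items() if rc & (1 << bit)]


def output_is_usable(rc: int) -> bool:
    """Assumption: only bits 0 and 1 mean there is no report worth parsing."""
    return (rc & 0b11) == 0


reasons = decode_smartctl_rc(4)
```

With a check along these lines, a drive that rejects one log page could still have the rest of its output parsed instead of raising on any non-zero status.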


@delboy711
Author

Thanks for your feedback @kanecko

Could you please post the results of the command

/usr/sbin/smartctl --all /dev/disk/by-id/nvme-nvme.c0a9-323231354536323544413434-4354323530503253534438-00000001

It will make the message a bit easier to read.
Thanks

@delboy711
Author

delboy711 commented Oct 4, 2025

I tried this PR today with an nvme drive connected by a USB adapter. The drive was recognised OK by Rockstor and could be formatted and used, but smartctl failed with the message

Unknown USB bridge [0x152d:0xa583 (0x214)]
Unable to detect device type
Please specify device type with the -d option.

On giving the option '-d sntjmicron', smartctl then worked normally.

On further investigation I found that smartmontools 7.5 on my Arch Linux desktop works OK, but version 7.4 as provided in openSUSE Leap 15.6 does not recognise newer USB adapters.
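
As a sketch of how the `-d` workaround might be wired in (this is not part of the PR, and the function names and the `/dev/sdX` path are made up for illustration): spot the hint in smartctl's output, then rebuild the command with an explicit device type supplied by the user or a lookup table.

```python
# The hint smartctl prints when it cannot work out the device type.
DEVICE_TYPE_HINT = "Please specify device type with the -d option."


def needs_device_type(output_lines):
    """True if smartctl asked for an explicit -d option."""
    return any(DEVICE_TYPE_HINT in line for line in output_lines)


def build_smartctl_cmd(device, device_type=None, smartctl="/usr/sbin/smartctl"):
    """Build the argument list for 'smartctl --all', optionally with -d."""
    cmd = [smartctl, "--all"]
    if device_type:
        cmd += ["-d", device_type]
    cmd.append(device)
    return cmd


failed_output = [
    "Unknown USB bridge [0x152d:0xa583 (0x214)]",
    "Unable to detect device type",
    "Please specify device type with the -d option.",
]
if needs_device_type(failed_output):
    # "/dev/sdX" is a placeholder; "sntjmicron" is the type that worked here.
    retry_cmd = build_smartctl_cmd("/dev/sdX", device_type="sntjmicron")
```

Which device type to pass for a given bridge is not something this sketch can know; here it came from trying the option by hand.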

@kanecko
Contributor

kanecko commented Oct 4, 2025

Thanks for your feedback @kanecko

Could you please post the results of the command

/usr/sbin/smartctl --all /dev/disk/by-id/nvme-nvme.c0a9-323231354536323544413434-4354323530503253534438-00000001

It will make the message a bit easier to read. Thanks

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.4.0-150600.23.70-default] (SUSE RPM)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org


=== START OF INFORMATION SECTION ===
Model Number:                       CT250P2SSD8
Firmware Version:                   P2CR048
PCI Vendor/Subsystem ID:            0xc0a9
IEEE OUI Identifier:                0x00a075
Total NVM Capacity:                 250,059,350,016 [250 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            00a075 6120000284
Local Time is:                      Sat Oct  4 18:53:07 2025 CEST
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     85 Celsius


Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W       -        -    0  0  0  0        0       0
 1 +     1.90W       -        -    1  1  1  1        0       0
 2 +     1.50W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3     5000    1900
 4 -   0.0020W       -        -    4  4  4  4    13000  100000


Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0


=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        38 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    0%
Data Units Read:                    1,566,530 [802 GB]
Data Units Written:                 10,302,049 [5.27 TB]
Host Read Commands:                 22,380,096
Host Write Commands:                194,058,124
Controller Busy Time:               526
Power Cycles:                       21
Power On Hours:                     23,643
Unsafe Shutdowns:                   11
Media and Data Integrity Errors:    0
Error Information Log Entries:      281,472
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               51 Celsius


Error Information (NVMe Log 0x01, 16 of 16 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message
  0     281472     0  0x7018  0x4004  0x004            0     1     -  Invalid Field in Command
  1     281471     0  0x3019  0x4004      -            0     0     -  Invalid Field in Command
  2     281470     0  0x3018  0x4004      -            0     0     -  Invalid Field in Command
  3     281469     0  0x201a  0x4004      -            0     0     -  Invalid Field in Command
  4     281468     0  0x2019  0x4004      -            0     0     -  Invalid Field in Command
  5     281467     0  0xb00a  0x4004      -            0     0     -  Invalid Field in Command
  6     281466     0  0xb009  0x4004      -            0     0     -  Invalid Field in Command
  7     281465     0  0x5004  0x4004      -            0     0     -  Invalid Field in Command
  8     281464     0  0x4007  0x4004      -            0     0     -  Invalid Field in Command
  9     281463     0  0x9019  0x4005      -            0     0     -  Invalid Field in Command
 10     281462     0  0x9018  0x4005      -            0     0     -  Invalid Field in Command
 11     281461     0  0xd014  0x4005      -            0     0     -  Invalid Field in Command
 12     281460     0  0xc017  0x4005      -            0     0     -  Invalid Field in Command
 13     281459     0  0x701b  0x4005      -            0     0     -  Invalid Field in Command
 14     281458     0  0x701a  0x4005      -            0     0     -  Invalid Field in Command
 15     281457     0  0x6019  0x4004      -            0     0     -  Invalid Field in Command


Read Self-test Log failed: Invalid Field in Command (0x2002)
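A minimal sketch of how a parser could recognise the failure line above rather than treating it as an empty self-test log. The function name `find_selftest_log_failure` and the regular expression are illustrative assumptions, not code from this PR; the wording matched is taken only from the smartctl 7.4 output shown here and may differ in other versions.

```python
import re

# Matches e.g. "Read Self-test Log failed: Invalid Field in Command (0x2002)".
# Assumption: the message format is as pasted in this thread (smartctl 7.4).
SELFTEST_FAIL_RE = re.compile(
    r"^Read Self-test Log failed: (?P<reason>.+?) \((?P<code>0x[0-9a-fA-F]+)\)\s*$"
)


def find_selftest_log_failure(output_lines):
    """Return (reason, status_code) if the self-test log read failed, else None."""
    for line in output_lines:
        match = SELFTEST_FAIL_RE.match(line.strip())
        if match:
            return match.group("reason"), int(match.group("code"), 16)
    return None
```

A caller could use a non-`None` result to show "self-test log unavailable" in the UI instead of an empty table.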

@delboy711
Author

Thanks @kanecko

This looks to be a known bug in smartmontools: https://github.com/smartmontools/smartmontools/issues/217

Could you try each of

smartctl -l /dev/nvme0n1
smartctl -l selftest /dev/nvme0
smartctl -d nvme,0xffffffff -l selftest /dev/nvme0n1
smartctl -d /dev/nvme,0xffffffff -l selftest /dev/nvme0n1

No need to post the results. Just report whether each gives the error 'Invalid Field in Command'.

Also if you have access to another computer with smartmontools 7.5 we can confirm if it is already fixed in the latest release.
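For anyone who would rather script this check, here is a rough sketch that runs the four commands listed above and reports which ones produce the error text. The helper names `run_smartctl` and `probe` are hypothetical, and the device paths are the ones from this comment; it assumes `smartctl` is on the PATH and that it is run with sufficient privileges.

```python
import subprocess

ERROR_TEXT = "Invalid Field in Command"

# The four command lines proposed in this comment, split into arguments.
CANDIDATE_COMMANDS = [
    ["smartctl", "-l", "/dev/nvme0n1"],
    ["smartctl", "-l", "selftest", "/dev/nvme0"],
    ["smartctl", "-d", "nvme,0xffffffff", "-l", "selftest", "/dev/nvme0n1"],
    ["smartctl", "-d", "/dev/nvme,0xffffffff", "-l", "selftest", "/dev/nvme0n1"],
]


def run_smartctl(args):
    """Run one command and return its stdout and stderr as a single string."""
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


def probe(commands, runner=run_smartctl):
    """Map each command line to True if its output contains ERROR_TEXT."""
    return {" ".join(cmd): ERROR_TEXT in runner(cmd) for cmd in commands}


if __name__ == "__main__":
    for command, has_error in probe(CANDIDATE_COMMANDS).items():
        print(f"{command} -> {'error' if has_error else 'no such error'}")
```

The `runner` parameter exists only so the logic can be exercised without real hardware. Note that "no such error" does not mean the command succeeded, only that this particular message was absent.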

@kanecko
Contributor

kanecko commented Oct 4, 2025

smartmontools 7.4 results:

smartctl -l /dev/nvme0n1
-> invalid argument error
smartctl -l selftest /dev/nvme0
-> reports "Invalid Field in Command"
smartctl -d nvme,0xffffffff -l selftest /dev/nvme0n1
-> works fine
smartctl -d /dev/nvme,0xffffffff -l selftest /dev/nvme0n1
-> invalid argument error

Upgraded to smartmontools 7.5 and retried all commands:
/usr/sbin/smartctl --all /dev/disk/by-id/nvme-nvme.c0a9-323231354536323544413434-4354323530503253534438-00000001

smartctl 7.5 2025-04-30 r5714 [x86_64-linux-6.4.0-150600.23.70-default] (SUSE RPM)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org


=== START OF INFORMATION SECTION ===
Model Number:                       CT250P2SSD8
Firmware Version:                   P2CR048
PCI Vendor/Subsystem ID:            0xc0a9
IEEE OUI Identifier:                0x00a075
Total NVM Capacity:                 250,059,350,016 [250 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            00a075 6120000284
Local Time is:                      Sat Oct  4 19:40:34 2025 CEST
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     85 Celsius


Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W       -        -    0  0  0  0        0       0
 1 +     1.90W       -        -    1  1  1  1        0       0
 2 +     1.50W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3     5000    1900
 4 -   0.0020W       -        -    4  4  4  4    13000  100000


Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0


=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        37 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    0%
Data Units Read:                    1,567,487 [802 GB]
Data Units Written:                 10,302,776 [5.27 TB]
Host Read Commands:                 22,397,550
Host Write Commands:                194,087,968
Controller Busy Time:               526
Power Cycles:                       21
Power On Hours:                     23,644
Unsafe Shutdowns:                   11
Media and Data Integrity Errors:    0
Error Information Log Entries:      281,480
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               50 Celsius


Error Information (NVMe Log 0x01, 16 of 16 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message
  0     281480     0  0xd012  0x4005  0x004            0     1     -  Invalid Field in Command
  1     281479     0  0xa018  0x4005      -            0     0     -  Invalid Field in Command
  2     281478     0  0x901b  0x4004      -            0     0     -  Invalid Field in Command
  3     281477     0  0xc011  0x4004      -            0     0     -  Invalid Field in Command
  4     281476     0  0xc010  0x4004      -            0     0     -  Invalid Field in Command
  5     281475     0  0xa012  0x4004      -            0     0     -  Invalid Field in Command
  6     281474     0  0xa011  0x4004      -            0     0     -  Invalid Field in Command
  7     281473     0  0x8019  0x4004  0x004            0     1     -  Invalid Field in Command
  8     281472     0  0x7018  0x4004  0x004            0     1     -  Invalid Field in Command
  9     281471     0  0x3019  0x4004      -            0     0     -  Invalid Field in Command
 10     281470     0  0x3018  0x4004      -            0     0     -  Invalid Field in Command
 11     281469     0  0x201a  0x4004      -            0     0     -  Invalid Field in Command
 12     281468     0  0x2019  0x4004      -            0     0     -  Invalid Field in Command
 13     281467     0  0xb00a  0x4004      -            0     0     -  Invalid Field in Command
 14     281466     0  0xb009  0x4004      -            0     0     -  Invalid Field in Command
 15     281465     0  0x5004  0x4004      -            0     0     -  Invalid Field in Command


Self-test Log (NVMe Log 0x06, NSID 0xffffffff)
Self-test status: No self-test in progress
Num  Test_Description  Status                       Power_on_Hours  Failing_LBA  NSID Seg SCT Code
 0   Short             Completed without error                 431            -     -   -   -    -
 1   Extended          Completed without error                 262            -     -   -   -    -
 2   Short             Completed without error                 262            -     -   -   -    -
 3   Short             Completed without error                   1            -     -   -   -    -
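
As an illustration of how the self-test log table above might be turned into structured data, the following sketch parses each row with a regular expression. It is not the parser from this PR: the function name `parse_nvme_selftest_log`, the dictionary keys, and the assumption that every row has exactly the nine columns shown in this 7.5 output are all mine.

```python
import re

# One data row: Num, Test_Description, Status (may contain spaces),
# Power_on_Hours, then five trailing columns (Failing_LBA, NSID, Seg, SCT, Code).
ROW_RE = re.compile(
    r"^\s*(?P<num>\d+)\s+(?P<description>\S+)\s+(?P<status>.+?)\s+"
    r"(?P<power_on_hours>\d+)\s+(?P<failing_lba>\S+)\s+(?P<nsid>\S+)\s+"
    r"(?P<seg>\S+)\s+(?P<sct>\S+)\s+(?P<code>\S+)\s*$"
)


def parse_nvme_selftest_log(output_lines):
    """Return one dict per self-test row; header and other lines are skipped."""
    rows = []
    for line in output_lines:
        match = ROW_RE.match(line)
        if match is None:
            continue
        row = match.groupdict()
        row["num"] = int(row["num"])
        row["power_on_hours"] = int(row["power_on_hours"])
        rows.append(row)
    return rows
```

The header line is skipped because it does not start with a number. Rows whose status text or column count differs from the sample above would need a looser pattern.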

smartctl -l /dev/nvme0n1
-> invalid argument error
smartctl -l selftest /dev/nvme0
-> works fine
smartctl -d nvme,0xffffffff -l selftest /dev/nvme0n1
-> works fine
smartctl -d /dev/nvme,0xffffffff -l selftest /dev/nvme0n1
-> invalid argument error
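
One possible way to act on these results, sketched under assumptions: read the version from the smartctl banner line and add the explicit `-d nvme,0xffffffff` device type only for versions older than 7.5. The function names `smartctl_version` and `selftest_log_args` are hypothetical, and the 7.5 threshold is inferred from this single drive's results rather than from smartmontools documentation.

```python
import re

# Banner example: "smartctl 7.4 2023-08-01 r5530 [x86_64-linux-...] (SUSE RPM)"
BANNER_RE = re.compile(r"^smartctl (\d+)\.(\d+)\b")


def smartctl_version(banner_line):
    """Return (major, minor) parsed from the banner line, or None if absent."""
    match = BANNER_RE.match(banner_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def selftest_log_args(version, device):
    """Build smartctl arguments for reading the NVMe self-test log.

    Assumption: versions before 7.5 need the broadcast namespace ID spelled
    out, as observed with 7.4 in this thread.
    """
    args = ["-l", "selftest"]
    if version is not None and version < (7, 5):
        args = ["-d", "nvme,0xffffffff"] + args
    return args + [device]
```

Since the explicit `-d nvme,0xffffffff` form worked on both 7.4 and 7.5 here, always passing it would be a simpler alternative, if it proves harmless on other drives.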
