Open Source NVMe® SSD Management Utility – NVMe Command Line Interface (NVMe-CLI)

Jonmichael Hands, VP Storage, Chia Network

NVM Express® (NVMe®) technology has enabled a robust set of industry-standard software, drivers, and management tools that have been developed for storage. The tool to manage NVMe SSDs in Linux is called NVMe Command Line Interface (NVMe-CLI).

Overview of features

Data centers require many management functions to monitor the health of the SSD, monitor endurance, update firmware, securely erase storage and read various logs. NVMe-CLI is a powerful open-source tool that follows the NVMe specification and is supported by all major distributions. It supports NVMe SSDs as well as NVMe over Fabrics (NVMe-oF™) architecture and offers optional vendor plugins for supplemental information above and beyond the specification.

You can search the Linux man page for help, but it won’t be enough to understand the capabilities of NVMe-CLI. The good news is all the commands in NVMe-CLI directly match the spec! All you need to do is download a copy of the latest NVMe spec to be able to interpret the abbreviations for the various commands.

All the abbreviations in the output of the NVMe commands can be found in the specification. For instance, for the Identify Controller data structure, you can send the command nvme id-ctrl in NVMe-CLI. The output will have abbreviations for the various fields.

Example: Model Number (MN) is displayed in NVMe-CLI as mn

NVMe-CLI can be obtained as a package for all major Linux distributions. The GitHub page has instructions for each distribution; an example for Ubuntu / Debian is below.

sudo apt install nvme-cli

You can also head over to the releases page and get the latest stable release, in which case you will have to follow the instructions on GitHub to compile and install it for your distribution. On Ubuntu 22.04, compiling looks like this.

wget https://github.com/linux-nvme/nvme-cli/archive/refs/tags/v2.3.tar.gz

tar -xvf v2.3.tar.gz

cd nvme-cli-2.3

sudo apt update

sudo apt install meson

sudo apt install ninja-build

sudo meson .build

sudo ninja -C .build

sudo meson install -C .build

Here is the cheat sheet of the most commonly used commands. Remember NVMe-CLI is powerful and can do almost anything that the NVMe specification calls out if used correctly. We will go into all these commands in detail.

Cheat Sheet

| Command | Description |
|---|---|
| nvme version | Display the current version |
| nvme list | List all the NVMe SSDs attached: name, serial number, size, LBA format, and firmware revision |
| nvme id-ctrl | Discover information about the NVMe controller and the features it supports |
| nvme id-ns | Discover optimal namespace size, protection information, LBA size |
| nvme format | Secure erase the data on an SSD, or change the LBA size or protection information for end-to-end data protection |
| nvme sanitize | Securely eliminate all data on the device; cannot be stopped. Supports block, crypto, and overwrite |
| nvme smart-log | Health of the SSD (critical warning info), temperature, endurance, power on hours and error summary |
| nvme error-log | A log that contains information about errors encountered |
| nvme reset | Resets the NVMe controller |
| nvme create-ns | Create a namespace; can be used for overprovisioning an SSD |
| nvme delete-ns | Remove a namespace |
| nvme device-self-test | Simple test for health of a drive, pass/fail |
| nvme fw-download, fw-commit | Download firmware to the drive, update the firmware on the drive |
| nvme help | Lists all the available commands |

Basics – Version, List, and Learning About the Capabilities of Attached NVMe Controllers / SSDs

NVMe Version

nvme version

nvme version 2.3 (git 2.3)

libnvme version 1.3 (git 1.3)

NVMe List

sudo nvme list

Identify Controller

The identify controller command is used to learn about the capabilities of the NVMe controllers (in most cases, this means the capabilities of an NVMe SSD). Instead of guessing which features a vendor supports, you can read them directly from the capabilities fields. Other useful information includes drive model, vendor, firmware version, etc., which all have abbreviations called out in the NVMe spec.

You can run the identify controller command with

sudo nvme id-ctrl /dev/nvme0n1 -H

Then look for 0x1 entries, which mark the features that the drive supports. Example from some output on a WD SN840:

oacs      : 0x5f

[10:10] : 0   Lockdown Command and Feature Not Supported

[9:9] : 0     Get LBA Status Capability Not Supported

[8:8] : 0     Doorbell Buffer Config Not Supported

[7:7] : 0     Virtualization Management Not Supported

[6:6] : 0x1   NVMe-MI Send and Receive Supported

[5:5] : 0     Directives Not Supported

[4:4] : 0x1   Device Self-test Supported

[3:3] : 0x1   NS Management and Attachment Supported

[2:2] : 0x1   FW Commit and Download Supported

[1:1] : 0x1   Format NVM Supported

[0:0] : 0x1   Security Send and Receive Supported
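To make the bit layout concrete, here is a minimal shell sketch that decodes an OACS value into the features it marks as supported. The function name decode_oacs is hypothetical and not part of NVMe-CLI; the bit names are simply copied from the output above, and the value 0x5f is the one from the same example.

```shell
#!/bin/sh
# Decode an OACS value (as printed by `nvme id-ctrl`) into the optional
# admin commands it marks as supported. Names are listed from bit 0 upward.
decode_oacs() {
    oacs=$(( $1 ))          # accepts decimal or 0x-prefixed hex
    bit=0
    for name in \
        "Security Send and Receive" \
        "Format NVM" \
        "FW Commit and Download" \
        "NS Management and Attachment" \
        "Device Self-test" \
        "Directives" \
        "NVMe-MI Send and Receive" \
        "Virtualization Management" \
        "Doorbell Buffer Config" \
        "Get LBA Status Capability" \
        "Lockdown Command and Feature"
    do
        # Print the name only when the corresponding bit is set
        if [ $(( (oacs >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $name"
        fi
        bit=$(( bit + 1 ))
    done
}

decode_oacs 0x5f
```

For 0x5f this lists bits 0 through 4 and bit 6, which matches the entries shown as 0x1 in the -H output.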

Identify Namespace

Namespaces are the construct in NVMe technology that hold user data. An NVMe controller can have multiple namespaces attached to it. Most NVMe SSDs today just use a single namespace, but multi-tenant applications, virtualization and security have use cases for multiple namespaces.

You can find out the size of the namespace (NSZE), and the namespace utilization (NUSE) is useful for generating reports on the percentage of LBAs that are being used. The identify namespace data also contains a lot of information that host software can use to optimize performance, data integrity, TRIM (deallocate), LBA size (e.g. 512B, 4kB) and more. Read the sections on Namespaces and Identify Namespace in the NVMe specification for all the detailed capabilities.
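As a rough illustration of the utilization report mentioned above, the sketch below divides NUSE by NSZE. The helper name ns_utilization and the sample values are made up for illustration; on a real system you would take the two numbers from the nsze and nuse lines of the id-ns output.

```shell
#!/bin/sh
# Percentage of a namespace's LBAs that are in use: NUSE / NSZE.
# $1 = nsze, $2 = nuse (decimal or 0x-prefixed hex, as printed by id-ns)
ns_utilization() {
    awk -v nsze="$(( $1 ))" -v nuse="$(( $2 ))" \
        'BEGIN { printf "%.1f\n", 100 * nuse / nsze }'
}

# Illustrative values only: 0x400 of 0x1000 LBAs in use
ns_utilization 0x1000 0x400
```

Keep in mind that how NUSE is reported can differ between drives, so treat the result as an estimate unless the vendor documents it.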

Health Monitoring

SMART Log

The most commonly used command in NVMe-CLI is likely smart-log, which reports health, temperature, status, and more through NVMe SMART. The critical warning field tells you if anything is wrong with the drive: any non-zero value means there is a problem, and the -H option breaks the field down bit by bit to show what may be wrong. Percentage used is the most important endurance attribute, as it increments as the cycles of the NAND on an SSD get worn out. Data Units Written reports the amount of data written in units of 1,000 × 512 bytes, and is now also parsed out to SI units (terabytes, petabytes). You will see the total amount of endurance an SSD can support in the specification sheet as TBW (Terabytes Written) or DWPD (Drive Writes per Day).

sudo nvme smart-log /dev/nvme0n1 -H

Smart Log for NVME device:nvme2n1 namespace-id:ffffffff

critical_warning                        : 0

Available Spare[0]             : 0

Temp. Threshold[1]             : 0

NVM subsystem Reliability[2]   : 0

Read-only[3]                   : 0

Volatile mem. backup failed[4] : 0

Persistent Mem. RO[5]          : 0

temperature                             : 37°C (310 Kelvin)

available_spare                         : 100%

available_spare_threshold               : 10%

percentage_used                         : 13%

endurance group critical warning summary: 0

Data Units Read                         : 2,007,752,408 (1.03 PB)

Data Units Written                      : 2,302,301,144 (1.18 PB)

host_read_commands                      : 5,252,462,107

host_write_commands                     : 5,522,436,731

controller_busy_time                    : 58,569

power_cycles                            : 73

power_on_hours                          : 8,348

unsafe_shutdowns                        : 19

media_errors                            : 0

num_err_log_entries                     : 500

Warning Temperature Time                : 0

Critical Composite Temperature Time     : 0

Temperature Sensor 1           : 34°C (307 Kelvin)

Temperature Sensor 2           : 37°C (310 Kelvin)

Temperature Sensor 3           : 36°C (309 Kelvin)

Thermal Management T1 Trans Count       : 0

Thermal Management T2 Trans Count       : 0

Thermal Management T1 Total Time        : 0

Thermal Management T2 Total Time        : 0
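The unit conversion for the Data Units counters can be checked by hand. The sketch below assumes one data unit is 1,000 × 512 bytes, as described above, and converts a counter to terabytes (10^12 bytes); data_units_to_tb is a hypothetical helper name, and the inputs are the values from the example log.

```shell
#!/bin/sh
# Convert a "Data Units" counter to terabytes (10^12 bytes).
# One data unit is 1,000 units of 512 bytes, i.e. 512,000 bytes.
data_units_to_tb() {
    units=$(printf '%s' "$1" | tr -d ',')   # strip thousands separators
    awk -v u="$units" 'BEGIN { printf "%.2f\n", u * 512000 / 1e12 }'
}

data_units_to_tb 2,302,301,144   # Data Units Written from the log above
data_units_to_tb 2,007,752,408   # Data Units Read from the log above
```

The results, 1178.78 TB written and 1027.97 TB read, agree with the 1.18 PB and 1.03 PB figures that NVMe-CLI prints next to the raw counters.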

Device Self-Test

A device self-test operation is a diagnostic testing sequence that tests the integrity and functionality of the controller and may include testing of the media associated with namespaces. A short device self-test operation should complete in two minutes or less. An extended device self-test operation should complete in the time indicated in the Extended Device Self-test Time field in the Identify Controller data structure or less.

| Segment | Test Performed | Failure Criteria |
|---|---|---|
| 1 – RAM Check | Write a test pattern to RAM, followed by a read and compare of the original data. | Any uncorrectable error or data miscompare |
| 2 – SMART Check | Check SMART or health status for Critical Warning bits set to ‘1’ in SMART / Health Information Log. | Any Critical Warning bit set to ‘1’ fails this segment |
| 3 – Volatile memory backup | Validate volatile memory backup solution health (e.g., measure backup power source charge and/or discharge time). | Significant degradation in backup capability |
| 4 – Metadata validation | Confirm/validate all copies of metadata. | Metadata is corrupt and is not recoverable |
| 5 – NVM integrity | Write/read/compare to reserved areas of each NVM. Ensure also that every read/write channel of the controller is exercised. | Data miscompare |
| 6 – Data Integrity | Perform background housekeeping tasks, prioritizing actions that enhance the integrity of stored data. Exit this segment in time to complete the remaining segments and meet the timing requirements for extended device self-test operation indicated in the Identify Controller data structure. | Metadata is corrupt and is not recoverable |
| 7 – Media Check | Perform random reads from every available good physical block. Exit this segment in time to complete the remaining segments. The time to complete is dependent on the type of device self-test operation. | Inability to access a physical block |
| 8 – Drive Life | End-of-life condition: Assess the drive’s suitability for continuing write operations. | The Percentage Used is set to 255 in the SMART / Health Information Log or an analysis of internal key operating parameters indicates that data is at risk if writing continues |
| 9 – SMART Check | Same as 2 – SMART Check | |

Run a short device self-test and read the log page:

sudo nvme device-self-test -s 1 /dev/nvme2n1

sudo nvme self-test-log /dev/nvme2n1 -v

Device Self Test Log for NVME device:nvme2n1

Current operation  : 0

Current Completion : 0%

Self Test Result[0]:

Operation Result             : 0 Operation completed without error

Self Test Code               : 1 Short device self-test operation

Valid Diagnostic Information : 0

Power on hours (POH)         : 0x20ab

Vendor Specific              : 0 0x20
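If you want a one-line verdict from the self-test log, something like the following sketch may help. It assumes the text layout shown above (an "Operation Result" line and a hexadecimal "Power on hours (POH)" line) and only looks at the first, most recent entry; summarize_self_test is a hypothetical helper, not an NVMe-CLI command, and other nvme-cli versions may format the log differently.

```shell
#!/bin/sh
# Summarize the most recent self-test entry read from stdin.
summarize_self_test() {
    log=$(cat)
    # First "Operation Result" value (0 means completed without error)
    result=$(printf '%s\n' "$log" |
        sed -n 's/^ *Operation Result *: *\([0-9a-fx]*\).*/\1/p' | head -n 1)
    # First "Power on hours (POH)" value, printed in hex by nvme-cli
    poh=$(printf '%s\n' "$log" |
        sed -n 's/^ *Power on hours (POH) *: *\([0-9a-fx]*\).*/\1/p' | head -n 1)
    if [ -z "$result" ] || [ -z "$poh" ]; then
        echo "no self-test result found"
        return 1
    fi
    if [ $(( $result )) -eq 0 ]; then
        verdict="PASS"
    else
        verdict="FAIL or incomplete (code $result)"
    fi
    echo "$verdict at $(( $poh )) power-on hours"
}

# Sample input copied from the log above
summarize_self_test <<'EOF'
Self Test Result[0]:
Operation Result             : 0 Operation completed without error
Self Test Code               : 1 Short device self-test operation
Valid Diagnostic Information : 0
Power on hours (POH)         : 0x20ab
EOF
```

On a live system the input would come from piping the output of the self-test-log command into the function.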

Error Log Page

Entries here don’t mean the drive has failed, but they can be helpful for debugging by folks developing filesystems, drivers, or other software. Some drives will clear the error log page after a sanitize command.

sudo nvme error-log /dev/nvme0n1

Error Log Entries for device:nvme2n1 entries:64

……………..

Entry[ 0]

……………..

error_count     : 500

sqid            : 0

cmdid           : 0x1008

status_field    : 0x6002(Invalid Field in Command: A reserved coded value or an unsupported value in a defined field)

phase_tag       : 0

parm_err_loc    : 0x28

lba             : 0

nsid            : 0

vs              : 0

trtype          : The transport type is not indicated or the error is not transport related.

cs              : 0

trtype_spec_info: 0

……………..

Device Management

Update device firmware

SSD vendors will typically release new firmware over the production period of the SSD. It is not uncommon to see four to five updates during a five-year deployment of an SSD. Firmware updates ensure the most up-to-date security patches, bug fixes, and reliability improvements. An OEM generally handles firmware updates with their management tools and cryptographically signed firmware images that match the OEM, but NVMe SSDs obtained with generic firmware from a channel partner or distributor can be updated. Ask your SSD vendor for the latest firmware version.

Instead of describing the process here, please visit the Firmware Update Process section in the NVMe spec. It covers where resets are needed and the concept of firmware slots: some NVMe SSDs can hold multiple copies of firmware on the device, and you can activate a specific copy to run. Generally speaking, most SSDs have redundant copies of the same image for security purposes.

Find current fw revision with `nvme list`

Download firmware to target drive

sudo nvme fw-download -f vendorfirmware.bin /dev/nvme0n1

Here is what the different commit actions (-a) do. As you can see, they nicely match the spec table.

0: Downloaded image replaces the image indicated by the Firmware Slot field. This image is not activated.

1: Downloaded image replaces the image indicated by the Firmware Slot field. This image is activated at the next reset.

2: The image indicated by the Firmware Slot field is activated at the next reset.

3: The image specified by the Firmware Slot field is requested to be activated immediately without reset.

The -s option can be used to select a specific slot.

sudo nvme fw-commit -s 0 -a 3 /dev/nvme0n1

After the firmware commit, you need to reset the drive if you used action 1 or 2, or if the device does not support firmware activation without reset (action 3).

Media Sanitization and NVMe Format

These commands are used to securely erase user data from the device, for example when deploying a new device, retiring a device at end-of-life, or repurposing an SSD for a new application. There are a few variations we will cover. Sanitize was introduced in the NVMe 1.3 specification; before then, NVMe Format was used exclusively to perform secure erase. While both options work, Sanitize is more robust for ensuring the data was properly wiped; Format is good for everyday use and testing.

Format

The Secure Erase Settings value (-s) selects what Format does to the data:

0: No Secure Erase operation requested (generally speaking, this just TRIMs/deallocates all the LBAs).

1: User Data Erase. This physically erases the data on the drive. In a mainstream NAND NVMe SSD, this triggers an erase of all the blocks and changes the cryptographic key. Because NAND erases consume power and time, this can take a while (single-digit minutes for large drives).

2: Crypto Erase. This completes much faster (under 1 second in most cases) by swapping out the cryptographic key so that all data is rendered unreadable. As in the other two cases, this deallocates the LBAs, and some drives may support deterministic read zero after TRIM for subsequent reads.

Remember when we learned about identify controller (id-ctrl)? It comes in handy here for checking what the NVMe SSD supports. Check bit 1 of Optional Admin Command Support (OACS) to see whether Format NVM is supported.

sudo nvme id-ctrl -H /dev/nvme2n1 | grep oacs -A 11

oacs      : 0x5f

[10:10] : 0   Lockdown Command and Feature Not Supported

[9:9] : 0     Get LBA Status Capability Not Supported

[8:8] : 0     Doorbell Buffer Config Not Supported

[7:7] : 0     Virtualization Management Not Supported

[6:6] : 0x1   NVMe-MI Send and Receive Supported

[5:5] : 0     Directives Not Supported

[4:4] : 0x1   Device Self-test Supported

[3:3] : 0x1   NS Management and Attachment Supported

[2:2] : 0x1   FW Commit and Download Supported

[1:1] : 0x1   Format NVM Supported

[0:0] : 0x1   Security Send and Receive Supported

Changing the LBA format: this is set via the nvme format command, but you can use identify namespace to check the LBA formats and sizes that the drive supports, and find out which is recommended by the SSD firmware.

Check Formatted LBA Size (FLBAS)

sudo nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"

[6:5] : 0     Most significant 2 bits of Current LBA Format Selected

[3:0] : 0     Least significant 4 bits of Current LBA Format Selected

LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)

LBA Format  1 : Metadata Size: 8   bytes - Data Size: 512 bytes - Relative Performance: 0 Best

LBA Format  2 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best

LBA Format  3 : Metadata Size: 8   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best

LBA Format  4 : Metadata Size: 64  bytes - Data Size: 4096 bytes - Relative Performance: 0 Best
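For scripting, the index of the format currently in use can be pulled out of this listing. The sketch below is one possible way to do it, assuming the "(in use)" marker shown above; current_lba_format is a hypothetical helper name, and the sample lines are abbreviated copies of the output.

```shell
#!/bin/sh
# Print the index of the LBA format marked "(in use)" in id-ns -H output
# read from stdin.
current_lba_format() {
    sed -n 's/^LBA Format *\([0-9][0-9]*\) *:.*(in use).*/\1/p'
}

# Sample input based on the listing above
current_lba_format <<'EOF'
LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)
LBA Format  2 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best
EOF
```

Checking this value before and after a format is a simple way to confirm that the new LBA format took effect.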

Now we can format the drive to a 4k sector size. Make sure you do NOT select your boot drive. NVMe format will wipe the data on the drive instantly! There is now a 10-second warning if you don’t use -f to force.

sudo nvme format -l 2 -f /dev/nvme0n1

Sanitize

Please see the NVMe specification for an overview of Sanitize Operations (Optional).

According to the NVMe specification, “a sanitize operation alters all user data in the NVM subsystem such that recovery of any previous user data from any cache, the non-volatile media, or any Controller Memory Buffer is not possible.”

You can find the Sanitize command, log pages, and theory of operation in the NVMe base specification.

  • Goal is to make user data unrecoverable. All user data in NVM, PMR, CMB, cache, metadata, unallocated, or overprovisioned space
  • Issued as a background operation with log and status
  • Types of sanitize supported, read identify controller
    • Block erase – low level media specific block erase (e.g. NAND erase block)
    • Crypto erase – change media encryption key
    • Overwrite – overwrite with a fixed pattern
  • Sanitize log page for status and estimated times for each method
  • Send async notification upon completion

Find an NVMe SSD’s sanitize capabilities through the Identify Controller command, to see which types it supports:

nvme id-ctrl /dev/nvme0 -H | grep sanicap -A 5

sanicap   : 0x2

[31:30] : 0   Additional media modification after sanitize operation completes successfully is not defined

[29:29] : 0   No-Deallocate After Sanitize bit in Sanitize command Supported

[2:2] : 0   Overwrite Sanitize Operation Not Supported

[1:1] : 0x1 Block Erase Sanitize Operation Supported

[0:0] : 0   Crypto Erase Sanitize Operation Not Supported

Send the Sanitize command with action 2 (-a 2), block erase

nvme sanitize -a 2 /dev/nvme0n1

Loop: Monitor Sanitize Status with Sanitize Log

nvme sanitize-log -H /dev/nvme0n1

Sanitize Progress                      (SPROG) :  40164 (61.285400%)

Sanitize Status                        (SSTAT) :  0x2

[2:0]   Sanitize in Progress.

Sanitize completes

nvme sanitize-log -H /dev/nvme0n1

Sanitize Progress                      (SPROG) :  65535

Sanitize Status                        (SSTAT) :  0x101

[2:0]   Most Recent Sanitize Command Completed Successfully.
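The two fields in the sanitize log can be interpreted with a little arithmetic. In the sketch below, SPROG is treated as a numerator out of 65,536 and the low three bits of SSTAT as the status code, which is consistent with the output above. The helper names are hypothetical, and only the status codes I am confident about are spelled out.

```shell
#!/bin/sh
# SPROG is a progress numerator with a denominator of 65,536.
sprog_percent() {
    awk -v s="$1" 'BEGIN { printf "%.4f\n", s * 100 / 65536 }'
}

# Bits 2:0 of SSTAT give the state of the most recent sanitize operation.
sstat_state() {
    case $(( $1 & 7 )) in
        0) echo "never sanitized" ;;
        1) echo "completed successfully" ;;
        2) echo "in progress" ;;
        3) echo "failed" ;;
        *) echo "other (see the NVMe specification)" ;;
    esac
}

sprog_percent 40164   # value from the in-progress log above
sstat_state 0x2       # in progress
sstat_state 0x101     # after completion
```

A monitoring loop could call nvme sanitize-log periodically and stop once the state is no longer "in progress".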

NVMe Namespace Management

Steps to overprovision using NVMe-CLI, per namespace:

  1. Detach all namespaces from each controller (the spec recommends detaching first, but delete also works).

sudo nvme detach-ns /dev/nvme0 --namespace-id=1 --controllers=0

  2. Delete each namespace you’ve detached.

sudo nvme delete-ns /dev/nvme0 --namespace-id=1

  3. Create a new namespace at the desired capacity (repeat for each namespace). This example goes from 3.84TB to 3.2TB, which is used to turn a 1 DWPD drive into a 3 DWPD drive with overprovisioning.

sudo nvme create-ns /dev/nvme0 --nsze-si 3.2T --ncap-si 3.2T --flbas 0 --dps 0 --nmic 0

  4. Attach new namespaces to desired controllers.

sudo nvme attach-ns /dev/nvme0 --namespace-id=1 --controllers=0

  5. Reset device to make the target visible to the host.

sudo nvme reset /dev/nvme0
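Namespace size and capacity are ultimately expressed in logical blocks, and as far as I know the plain --nsze and --ncap options of create-ns take block counts rather than SI sizes. The sketch below converts a target capacity into a block count for a given LBA size; blocks_for_capacity is a hypothetical helper, and 3.2TB is taken as 3,200,000,000,000 bytes.

```shell
#!/bin/sh
# Number of logical blocks needed for a given capacity.
# $1 = capacity in bytes, $2 = LBA data size in bytes
blocks_for_capacity() {
    echo $(( $1 / $2 ))
}

blocks_for_capacity 3200000000000 512    # 3.2TB with 512-byte LBAs
blocks_for_capacity 3200000000000 4096   # 3.2TB with 4096-byte LBAs
```

This gives 6,250,000,000 blocks at 512 bytes and 781,250,000 blocks at 4096 bytes, so the right number depends on the LBA format selected with --flbas.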

Summary

NVMe-CLI is a very powerful tool for managing NVMe SSDs directly in Linux. All the information needed to understand the features and functionality is contained in the NVMe specs, so don’t be afraid to download a copy and open it! I’ve highlighted the most common commands for managing NVMe SSDs, but the tool also works for NVMe-oF architecture, which will be covered separately. NVMe technology has a robust set of management, logging, and error reporting capabilities, and NVMe-CLI is the way to unlock that value in Linux. NVMe-CLI is also a great way to start learning about the capabilities of NVMe in a hands-on way, so download it and try it out for yourself!