Jean-François Marie, Chief Solution Architect, Kalray
Today, use cases and workloads in modern data centers are increasing, driving the need for storage and compute disaggregation and a move to hyperconverged infrastructure (HCI). This HCI approach often leads to the siloing of storage, which wastes capacity and limits scalability. Recently, I had the pleasure of hosting the final NVM Express webcast of 2020, in which I introduced this concept and discussed how Smart Storage Adapters can leverage NVMe® over Fabrics (NVMe-oF™) technology for a new HCI approach and achieve a true composable infrastructure.
During the live webcast, I discussed the data processing unit revolution, future data center infrastructure challenges and use cases such as Fabric attached Bunch of Flash (FBOF), Disaggregated Composable Infrastructure, HCI with NVMe-oF technology, a new HCI topology with Smart Storage Adapters and PCIe Peer-to-peer NVMe technology emulation.
I wasn’t able to answer every audience question we received, so I have provided the answers below:
Q: Does this solution architecture require a parallel NVMe architecture that is connected? This is parallel to the common TCP/IP stack being used by the x86. It is not shared with the x86 TCP/IP stack.
A: You may use a TCP/IP network, but you need the Smart Storage Adapter to be directly connected to that network. If you have a NIC which is managed by the x86, then this is a separate network.
Q: Does this require a switch to connect to the NVMe drives for Peer2Peer transfers or are you using the x86 RC to facilitate the Peer2Peer transfers?
A: A PCIe® switch, if available, can be used but is not mandatory. Using a PCIe switch for Peer2Peer transfers requires disabling ACS, which is not recommended on a server running VMs. (ACS stands for Access Control Services and is used to control which devices can communicate with one another, and thus avoid improper routing of packets.)
Currently, we are using the x86 RC (root complex) to facilitate the P2P transfers, relying on vfio-pci to configure the IOMMU and remoting PCIe config accesses. This way, even if configured in PCIe EP (endpoint) mode, the Smart Storage Adapter can fully take control of the server’s NVMe drives.
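For readers who want to see how a Linux host groups devices for vfio-pci, here is a minimal sketch. It relies on the fact that Linux exposes IOMMU groups under /sys/kernel/iommu_groups and that VFIO assigns devices at group granularity. The helper names (iommu_groups, group_of) are hypothetical and chosen for illustration; this is not the Smart Storage Adapter's own tooling.

```python
from pathlib import Path

def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU disabled in firmware/kernel, or not a Linux host.
        return groups
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if group_dir.name.isdigit() and devices.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices.iterdir())
    return groups

def group_of(groups, bdf):
    """Return the IOMMU group number holding the given PCI address, or None."""
    for number, members in groups.items():
        if bdf in members:
            return number
    return None

if __name__ == "__main__":
    for number, members in sorted(iommu_groups().items()):
        print(f"group {number}: {', '.join(members)}")
```

Devices that land in the same group cannot be isolated from one another by the IOMMU, which is one reason ACS settings matter on a server running VMs.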
Q: Will SmartNIC be able to raid across SAS drives?
A: Yes. RAID will be supported regardless of the backend storage.
Q: Is the original NVMe SSD namespace going to be removed or will that namespace still be available? How do you prevent simultaneous access from a P2P path and the direct x86 path?
A: When running a P2P path, the NVMe SSDs are fully controlled by the Smart Storage Adapter (see the question above). The x86 doesn’t have direct access to them.
An NVMe ‘volume’ (it can be a physical NVMe SSD namespace or a logical volume corresponding for instance to a RAID group) is exposed to x86 by the Smart Storage Adapter on PCIe architecture via the ‘NVMe emulation’ feature.
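To illustrate how a logical volume can sit in front of several physical namespaces, here is a minimal sketch of address translation for a striped (RAID 0) group. The RAID level, the chunk size, and the function name raid0_map are assumptions made for illustration; the answer above does not say which RAID layouts the adapter implements.

```python
def raid0_map(lba, num_drives, chunk_blocks):
    """Translate a logical block address into (drive index, physical LBA).

    Assumes plain RAID 0 striping: consecutive chunks of `chunk_blocks`
    blocks rotate across `num_drives` drives.
    """
    if lba < 0 or num_drives < 1 or chunk_blocks < 1:
        raise ValueError("lba must be >= 0; num_drives and chunk_blocks >= 1")
    chunk_index = lba // chunk_blocks          # which chunk of the volume
    offset_in_chunk = lba % chunk_blocks       # position inside that chunk
    drive = chunk_index % num_drives           # chunks rotate across drives
    physical_lba = (chunk_index // num_drives) * chunk_blocks + offset_in_chunk
    return drive, physical_lba

if __name__ == "__main__":
    for lba in (0, 4, 8, 13):
        print(lba, raid0_map(lba, num_drives=2, chunk_blocks=4))
```

The host sees only the logical addresses of the emulated NVMe volume; a mapping of this kind would run on the adapter side.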
Q: Will the card support Zoned Namespaces (ZNS)?
A: Yes, supporting ZNS is part of our plan, especially for large-capacity SSDs such as QLC and soon PLC.
Q: Amazon talks about handling storage services using their Nitro SmartNIC. There are similar claims made by other SmartNIC vendors as well. Are you primarily an NVMe-oF target node that focuses on storage services? How do you see Data Centers and Enterprises picking between these various devices like SmartNIC, DPU, IPU etc.?
A: The Smart Storage Adapter can be configured as either an NVMe-oF target controller or an NVMe-oF initiator. We are also developing a SmartNIC product to offload NFV functions such as vSwitch offload or an IPsec/TLS gateway.
From a Data Center perspective, today there are a variety of choices, all depending on the performance. If you want to manage 10GbE or 25GbE systems, then Smart Adapters have enough power to offload several services such as storage, network, compute, and security on the same device and inline. This will be massively embraced at the Edge. Now, when it comes to handling 100 GbE or more, a lot more power is necessary to handle the network itself, leaving less room for more services. It is the same when it comes to feeding GPUs or high-power compute nodes. There is a need to provide an efficient transport layer and there are some great ideas with PCI or Ethernet disaggregation enabling composable architecture. However, we are at the edge of innovation, because for this to be sustainable, you need to have PCIe 4.0 or 5.0 architecture, not yet defined, 200 GbE or more. We are just at the beginning of this journey.
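A rough back-of-the-envelope calculation shows why faster Ethernet pushes toward newer PCIe generations. The sketch below uses the published per-lane signalling rates (8, 16 and 32 GT/s for PCIe 3.0, 4.0 and 5.0) and 128b/130b encoding. It assumes an x16 slot and ignores packet and protocol overhead, so real throughput is lower. The function names are hypothetical.

```python
# Per-lane raw signalling rate in GT/s for each PCIe generation.
PCIE_GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}
# PCIe 3.0 and later use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130

def pcie_gbps(generation, lanes=16):
    """Approximate one-direction PCIe bandwidth in Gb/s (before TLP overhead)."""
    return PCIE_GT_PER_LANE[generation] * ENCODING_EFFICIENCY * lanes

def can_feed(generation, ethernet_gbps, lanes=16):
    """True if the PCIe link is at least as fast as the Ethernet port."""
    return pcie_gbps(generation, lanes) >= ethernet_gbps

if __name__ == "__main__":
    for gen in (3, 4, 5):
        print(f"PCIe {gen}.0 x16: {pcie_gbps(gen):.1f} Gb/s, "
              f"feeds 200 GbE: {can_feed(gen, 200)}")
```

Under these assumptions a PCIe 3.0 x16 link tops out near 126 Gb/s, which covers a single 100 GbE port but not 200 GbE.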
Q: It is very important to use smart adapters and drivers to reduce latency and improve agility. In the meantime, in real-time computing and interaction, how can we guarantee the security of drivers? For complex safety guarantees, we also need to improve the agility for security.
A: Given that we are compliant with the NVMe specification on the PCIe interface, the NVMe drivers of legacy OSes can be used. The security of the drivers is the responsibility of the OS provider. Also, the SR-IOV feature and the corresponding isolation features inside our MPPA are there to guarantee multi-tenant isolation and security.
NVM Express Looks Forward to a Continued Webinar Series in 2021
Thank you to those who engaged with my presentation and submitted the above questions. If you missed the live webcast, I invite you to watch the full webcast recording on the NVM Express YouTube channel and share it with your friends and colleagues.
I wish the storage community a happy holiday season and New Year. As we enter 2021, the NVM Express community looks forward to supporting and implementing further infrastructure enhancements with NVMe and NVMe-oF technologies.