Regions, Atomic Sectors, and DAX
One of the few reasons to allow multiple BLK namespaces per REGION is so that each BLK-namespace can be configured with a Block Translation Table (BTT) with unique atomic sector sizes. While a PMEM device can host a BTT, the LABEL specification does not provide for a sector size to be specified for a PMEM namespace. This is due to the expectation that the primary usage model for PMEM is via DAX, and the BTT is incompatible with DAX. However, for the cases where an application or filesystem still needs atomic sector update guarantees, it can register a BTT on a PMEM device or partition.
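To make the atomic sector update guarantee concrete, the following is a toy sketch of the general idea behind a block translation table: new data is written to a spare block first, and a single map-entry update then commits it. This is an illustration under stated assumptions, not the on-media BTT format; the class `ToyBtt`, its parameters, and the `crash_before_commit` flag are all hypothetical names introduced here.

```python
class ToyBtt:
    """Toy model of atomic sector updates (hypothetical, not the real BTT layout)."""

    def __init__(self, nblocks, sector_size):
        self.sector_size = sector_size
        # One spare physical block beyond the logical capacity.
        self.blocks = [bytes(sector_size) for _ in range(nblocks + 1)]
        # Logical block address -> physical block index.
        self.map = list(range(nblocks))
        # Index of the single spare (free) physical block.
        self.free = nblocks

    def write(self, lba, data, crash_before_commit=False):
        if len(data) != self.sector_size:
            raise ValueError("data must be exactly one sector")
        target = self.free
        # Step 1: fill the spare block; the old sector is left untouched.
        self.blocks[target] = bytes(data)
        if crash_before_commit:
            # Simulated power loss: the map still points at the old sector.
            return
        # Step 2: one map-entry update commits the new sector, and the
        # previously mapped block becomes the new spare.
        self.map[lba], self.free = target, self.map[lba]

    def read(self, lba):
        return self.blocks[self.map[lba]]


btt = ToyBtt(nblocks=4, sector_size=512)
old, new = b"A" * 512, b"B" * 512
btt.write(1, old)
btt.write(1, new, crash_before_commit=True)  # interrupted update
print(btt.read(1) == old)  # the reader never sees a torn sector
```

In this model a reader sees either the complete old sector or the complete new one, which is the property a filesystem relies on when it asks for atomic sector updates.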
Figure 1 shows an example platform with four NVDIMMs, two integrated memory controllers (IMC), and a single CPU.
                               (a)               (b)           DIMM   BLK-REGION
            +-------------------+--------+--------+--------+
  +------+  |       pm0.0       | blk2.0 | pm1.0  | blk2.1 |    0      region2
  | imc0 +--+- - - region0- - - +--------+        +--------+
  +--+---+  |       pm0.0       | blk3.0 | pm1.0  | blk3.1 |    1      region3
     |      +-------------------+--------v        v--------+
  +--+---+                               |                 |
  | cpu0 |                                     region1
  +--+---+                               |                 |
     |      +----------------------------^        ^--------+
  +--+---+  |           blk4.0           | pm1.0  | blk4.0 |    2      region4
  | imc1 +--+----------------------------|        +--------+
  +------+  |           blk5.0           | pm1.0  | blk5.0 |    3      region5
            +----------------------------+--------+--------+
Figure 1: Example sysfs layout
Each unique interface (BLK or PMEM) to DPA space is identified by a region device with a dynamically assigned id (REGION0 - REGION5).
1. The first portion of DIMM0 and DIMM1 is interleaved as REGION0. A single PMEM namespace is created in the REGION0-SPA-range that spans most of DIMM0 and DIMM1 with a user-specified name of "pm0.0". Some of that interleaved system-physical-address range is reclaimed as BLK-aperture accessed space starting at DPA-offset (a) into each DIMM. In that reclaimed space, we create two BLK-aperture "namespaces" from REGION2 and REGION3, where "blk2.0" and "blk3.0" are just human-readable names that could be set to any user-desired name in the LABEL.
2. In the last portion of DIMM0 and DIMM1 we have an interleaved system-physical-address range, REGION1, that spans those two DIMMs as well as DIMM2 and DIMM3. Some of REGION1 is allocated to a PMEM namespace named "pm1.0"; the rest is reclaimed in four BLK-aperture namespaces (one for each DIMM in the interleave set): "blk2.1", "blk3.1", "blk4.0", and "blk5.0".
3. The portions of DIMM2 and DIMM3 that do not participate in the REGION1 interleaved system-physical-address range (i.e. the DPA address past offset (b)) are also included in the "blk4.0" and "blk5.0" namespaces. Note that this example shows that BLK-aperture namespaces don't need to be contiguous in DPA-space.

The bus in this example is provided by the kernel under the device
/sys/devices/platform/nfit_test.0
when CONFIG_NFIT_TEST is enabled and the nfit_test.ko module is loaded. This tests not only LIBNVDIMM but the acpi_nfit.ko driver as well.
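The layout walked through in the numbered list can also be written down as a small data model. The sketch below transcribes Figure 1 row by row, one list of segments per DIMM in increasing DPA order, and derives which DIMMs back each namespace and which namespaces occupy more than one DPA extent. The names `DIMM_ROWS`, `BLK_REGION`, `dimms_of`, `extent_count`, and `discontiguous_namespaces` are hypothetical helpers for illustration only; nothing here reads sysfs or uses a kernel interface.

```python
# Figure 1, one row per DIMM, segments listed in increasing DPA order.
DIMM_ROWS = {
    0: ["pm0.0", "blk2.0", "pm1.0", "blk2.1"],
    1: ["pm0.0", "blk3.0", "pm1.0", "blk3.1"],
    2: ["blk4.0", "pm1.0", "blk4.0"],
    3: ["blk5.0", "pm1.0", "blk5.0"],
}

# BLK region associated with each DIMM (rightmost column of Figure 1).
BLK_REGION = {0: "region2", 1: "region3", 2: "region4", 3: "region5"}


def dimms_of(namespace):
    """Return the sorted DIMM ids whose DPA space backs `namespace`."""
    return sorted(d for d, row in DIMM_ROWS.items() if namespace in row)


def extent_count(namespace, dimm):
    """Count the separate DPA extents `namespace` occupies on `dimm`."""
    row = DIMM_ROWS[dimm]
    # An extent starts wherever the namespace appears and the previous
    # segment on the same DIMM belongs to a different namespace.
    return sum(
        1
        for i, seg in enumerate(row)
        if seg == namespace and (i == 0 or row[i - 1] != namespace)
    )


def discontiguous_namespaces():
    """Return namespaces that have more than one extent on some DIMM."""
    return sorted(
        {
            seg
            for dimm, row in DIMM_ROWS.items()
            for seg in row
            if extent_count(seg, dimm) > 1
        }
    )


print(dimms_of("pm0.0"))           # DIMMs interleaved under REGION0
print(dimms_of("pm1.0"))           # DIMMs interleaved under REGION1
print(discontiguous_namespaces())  # BLK namespaces split in DPA space
```

Modeling each DIMM as an ordered list keeps the sketch faithful to the figure without inventing concrete DPA offsets, which the example does not specify.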