Date: Mon, 23 Apr 2018 16:54:53 +0400
From: Igor Stoppa <>
Subject: [PATCH 4/9] Documentation for Pmalloc

Detailed documentation about the protectable memory allocator.

Signed-off-by: Igor Stoppa <>
 Documentation/core-api/index.rst   |   1 +
 Documentation/core-api/pmalloc.rst | 161 +++++++++++++++++++++++++++++++++++++
 2 files changed, 162 insertions(+)
 create mode 100644 Documentation/core-api/pmalloc.rst

diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index c670a8031786..8f5de42d6571 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -25,6 +25,7 @@ Core utilities
+   pmalloc
 Interfaces for kernel debugging
diff --git a/Documentation/core-api/pmalloc.rst b/Documentation/core-api/pmalloc.rst
new file mode 100644
index 000000000000..27eb7b3eafc4
--- /dev/null
+++ b/Documentation/core-api/pmalloc.rst
@@ -0,0 +1,161 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. _pmalloc:
+Protectable memory allocator
+The pmalloc library is meant to provide read-only status to data that,
+for some reason, can neither be declared as constant nor take advantage
+of the __ro_after_init qualifier, but that is, in spirit, written once
+and read-only from then on.
+At some point the data might get torn down; however, that doesn't affect
+how it is treated while it is still relevant.
+Pmalloc protects data from both accidental and malicious overwrites.
+Example: A policy that is loaded from userspace.
+The MMU available in the system can be used to write-protect memory pages.
+Unfortunately this feature cannot be used as-is to protect sensitive
+data, because this potentially read-only data is typically interleaved
+with other data, which must stay writable.
+pmalloc introduces the concept of protectable memory pools.
+A pool contains a list of areas of virtually contiguous pages of
+memory. When memory is requested from a pool, the requests are satisfied
+by reserving adequate amounts of memory from the active area of memory in
+that pool. A request can cross page boundaries, so an area is the
+smallest unit that pmalloc can protect.
+There might be special cases where an area contains only one page, but
+they are still addressed as areas.
+Areas are allocated on-the-fly, when the space available is insufficient
+for satisfying the latest request received.
+To facilitate the conversion of existing code to pmalloc pools, several
+helper functions are provided, mirroring their k/vmalloc counterparts.
+However, there is no pfree(), because the memory protected by a pool is
+released exclusively when the pool is destroyed.
+When to use pmalloc
+- Pmalloc memory is intended to complement __ro_after_init.
+  __ro_after_init requires that the initialization value is applied before
+  init is completed. If this is not possible, then pmalloc can be used.
+- Pmalloc can also be useful when the amount of data to protect is not
+  known at compile time and the memory can only be allocated dynamically.
+- Finally, it can be useful when it is desirable to control dynamically
+  (for example through the kernel command line) whether some specific
+  data ought to be protected, without having to rebuild the kernel just
+  to toggle a "const" qualifier.
+  This can be used, for example, by a Linux distribution to create a more
+  versatile binary kernel and allow its users to toggle between developer
+  (unprotected) and production (protected) modes by reconfiguring the
+  bootloader.
+When *not* to use pmalloc
+Using pmalloc is not a good idea when optimizing TLB utilization is
+paramount: pmalloc relies on virtual memory areas and will therefore use
+more TLB entries. It still does a better job than invoking vmalloc for
+each allocation, but it is undeniably less optimized with respect to
+TLB use than using the physmap directly, through kmalloc or similar.
+- When a pool is protected, whatever memory is still available in the
+  current vmap_area (from which allocations are performed) is
+  relinquished.
+- As already explained, freeing of memory is not supported. Pages will be
+  returned to the system upon destruction of the memory pool that they
+  belong to. For this reason, no pfree() function is provided.
+- The address range available for vmalloc (and thus for pmalloc too) is
+  limited on 32-bit systems. However, this shouldn't be an issue, since
+  not much data is expected to be dynamically allocated and turned
+  read-only.
+- Regarding SMP systems, the allocations are expected to happen mostly
+  during an initial transient, after which there should be no more need
+  to perform cross-processor synchronizations of page tables.
+  Loading of kernel modules is an exception to this, but it's not expected
+  to happen frequently enough to become a problem.
+- While pmalloc memory can be protected, since it is allocated dynamically,
+  it is still subject to indirect attacks, where the memory itself is not
+  touched, but anything used as reference to the allocation can be altered.
+  In some cases the allocation from a pmalloc pool is referred to by another
+  allocation, from either the same or another pool, however at some point,
+  there will be a base reference which can be attacked, if it cannot be
+  protected.
+  This base reference, or "anchor", is suitable for protection using
+  __ro_after_init, since it only needs to store the *address* of the
+  pmalloc allocation, which can itself be initialized and protected
+  later on. The allocation can take place during init, after which its
+  address is known and constant.
+Typical sequence, when using pmalloc
+Steps to perform during init:
+#. create an "anchor", with the modifier __ro_after_init
+#. create a pool
+   :c:func:`pmalloc_create_pool`
+#. issue an allocation request to the pool with either
+   :c:func:`pmalloc`
+   or one of its variants, like
+   :c:func:`pzalloc`
+   assigning its address to the anchor
+#. repeat the previous steps as needed
+The following steps can be performed at any time, both during and after
+init, as long as they strictly come after the previous sequence.
+#. initialize the memory obtained from the pool(s) with the desired value
+#. write-protect the memory allocated so far
+   :c:func:`pmalloc_protect_pool`
+#. repeat the last two steps as needed
+#. [optional] destroy the pool
+   :c:func:`pmalloc_destroy_pool`
+.. kernel-doc:: include/linux/pmalloc.h
+.. kernel-doc:: mm/pmalloc.c
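
The pool and area behaviour described in the document above can be approximated in userspace with mmap() and mprotect(), which may help when reasoning about the allocate, initialize, protect, destroy sequence. What follows is a minimal sketch under that assumption, not the pmalloc implementation: every demo_* name is hypothetical, the 16-byte alignment and the header stored inside each area are illustrative choices, and the kernel functions named in the patch (pmalloc_create_pool() and friends) are not used or modelled exactly.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical userspace model; these are NOT the kernel pmalloc names. */
struct demo_area {
	struct demo_area *next;	/* older areas of the same pool */
	size_t size;		/* mapped length, a multiple of the page size */
};

struct demo_pool {
	struct demo_area *areas;	/* list head is the active area */
	char *cursor;			/* next free byte in the active area */
	size_t remaining;		/* bytes left in the active area */
};

#define DEMO_ALIGN(x) (((x) + 15) & ~(size_t)15)

/* Bump-allocate from the active area; map a new area if it is too small. */
static void *demo_alloc(struct demo_pool *pool, size_t size)
{
	size = DEMO_ALIGN(size ? size : 1);
	if (size > pool->remaining) {
		size_t page = (size_t)sysconf(_SC_PAGESIZE);
		size_t hdr = DEMO_ALIGN(sizeof(struct demo_area));
		size_t len = (hdr + size + page - 1) / page * page;
		struct demo_area *area = mmap(NULL, len, PROT_READ | PROT_WRITE,
					      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (area == MAP_FAILED)
			return NULL;
		area->next = pool->areas;
		area->size = len;
		pool->areas = area;
		pool->cursor = (char *)area + hdr;
		pool->remaining = len - hdr;
	}
	pool->cursor += size;
	pool->remaining -= size;
	return pool->cursor - size;
}

/* Make every area read-only and give up what is left of the active one. */
static int demo_protect(struct demo_pool *pool)
{
	for (struct demo_area *a = pool->areas; a; a = a->next)
		if (mprotect(a, a->size, PROT_READ) != 0)
			return -1;
	pool->remaining = 0;
	return 0;
}

/* There is no per-allocation free: memory goes back only on destruction. */
static void demo_destroy(struct demo_pool *pool)
{
	while (pool->areas) {
		struct demo_area *next = pool->areas->next;
		munmap(pool->areas, pool->areas->size);
		pool->areas = next;
	}
	pool->cursor = NULL;
	pool->remaining = 0;
}

/* Return 1 if writing to p faults (tried in a forked child), 0 if not. */
static int demo_write_faults(void *p)
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		*(volatile char *)p = 1;	/* touches only the child's copy */
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return !(WIFEXITED(status) && WEXITSTATUS(status) == 0);
}

/* Follows the documented order: allocate, initialize, protect, destroy. */
static int demo_sequence(void)
{
	struct demo_pool pool = { 0 };
	int *limit = demo_alloc(&pool, sizeof(*limit));
	int ok;

	if (!limit)
		return -1;
	*limit = 8;					/* initialize */
	ok = demo_protect(&pool) == 0;			/* write-protect */
	ok = ok && demo_write_faults(limit) == 1 && *limit == 8;
	ok = ok && demo_alloc(&pool, 64) && pool.areas->next; /* new area */
	demo_destroy(&pool);
	return ok ? 0 : -1;
}
```

In this model, demo_sequence() is meant to return 0 when the protected value stays readable, a write to it faults, and an allocation made after protection lands in a freshly mapped area. That last point mirrors the caveat that the remainder of the current area is relinquished once a pool is protected. The write attempt runs in a forked child so that the expected fault does not take down the caller.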
