From: Andrew Cooper Date: Tue, 27 Nov 2018 17:21:17 +0000 Subject: IOMMU/VT-d: Correct the calculation of which IRTE to use This has been broken since c/s 1269742697 "VT-d: Allocates page table pgd, root_entry, iremap and qinval from domheap rather than xenheap" in Xen 3.3. The VT-d fields iremap_maddr and qinval_maddr point to the base of a block of contiguous RAM, allocated by the driver, holding the Interrupt Remapping table, and the Queued Invalidation ring. The logic to pick a specific entry depends on them being page aligned (as expected given their name), but the enable functions OR extra bits of control-register metadata into them. This ends up being ok in the qinval case, because QINVAL_PAGE_ORDER (value 2) is smaller than the size of a QINVAL entry (16 bytes), and the entry calculation doesn't overflow into the following page when referencing the entry at the end of the page. However for the intremap case, the metadata is IRTA_REG_TABLE_SIZE (value 0xf, again fine) and optionally IRTA_EIME (value 0x800). In practice, when using Extended Interrupt Mode, accesses to the second half of IRTEs in each page end up editing the wrong part of the interrupt remapping table. For accesses intending to target the final page of the remapping table, this error causes the accesses to actually hit the next sequential bit of RAM, whatever that happens to be. Nothing reads the metadata out of the *_maddr fields, so fix the issue by leaving the fields properly aligned and only including the metadata when writing to the IOMMU control registers. While not a security itself, for consistency fix the same logical bug when reading the IRTA register and following it to dump the Interrupt Remapping table. This is XSA-283 Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich Reviewed-by: Paul Durrant diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c index a0663ec..21d1b19 100644 --- a/xen/drivers/passthrough/vtd/intremap.c +++ b/xen/drivers/passthrough/vtd/intremap.c @@ -802,14 +802,15 @@ int enable_intremap(struct iommu *iommu, int eim) ir_ctrl->iremap_num = 0; } - /* set extended interrupt mode bit */ - ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0; - spin_lock_irqsave(&iommu->register_lock, flags); - /* set size of the interrupt remapping table */ - ir_ctrl->iremap_maddr |= IRTA_REG_TABLE_SIZE; - dmar_writeq(iommu->reg, DMAR_IRTA_REG, ir_ctrl->iremap_maddr); + /* + * Set size of the interrupt remapping table and optionally Extended + * Interrupt Mode. + */ + dmar_writeq(iommu->reg, DMAR_IRTA_REG, + ir_ctrl->iremap_maddr | IRTA_REG_TABLE_SIZE | + (eim ? IRTA_EIME : 0)); /* set SIRTP */ gcmd = dmar_readl(iommu->reg, DMAR_GSTS_REG); diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c index e95dc54..01447cf 100644 --- a/xen/drivers/passthrough/vtd/qinval.c +++ b/xen/drivers/passthrough/vtd/qinval.c @@ -428,6 +428,8 @@ int enable_qinval(struct iommu *iommu) flush->context = flush_context_qi; flush->iotlb = flush_iotlb_qi; + spin_lock_irqsave(&iommu->register_lock, flags); + /* Setup Invalidation Queue Address(IQA) register with the * address of the page we just allocated. QS field at * bits[2:0] to indicate size of queue is one 4KB page. @@ -435,10 +437,8 @@ int enable_qinval(struct iommu *iommu) * registers are automatically reset to 0 with write * to IQA register. */ - qi_ctrl->qinval_maddr |= QINVAL_PAGE_ORDER; - - spin_lock_irqsave(&iommu->register_lock, flags); - dmar_writeq(iommu->reg, DMAR_IQA_REG, qi_ctrl->qinval_maddr); + dmar_writeq(iommu->reg, DMAR_IQA_REG, + qi_ctrl->qinval_maddr | QINVAL_PAGE_ORDER); dmar_writeq(iommu->reg, DMAR_IQT_REG, 0); diff --git a/xen/drivers/passthrough/vtd/utils.c b/xen/drivers/passthrough/vtd/utils.c index 85e0f41..94a6e4e 100644 --- a/xen/drivers/passthrough/vtd/utils.c +++ b/xen/drivers/passthrough/vtd/utils.c @@ -204,8 +204,9 @@ void vtd_dump_iommu_info(unsigned char key) if ( status & DMA_GSTS_IRES ) { /* Dump interrupt remapping table. */ - u64 iremap_maddr = dmar_readq(iommu->reg, DMAR_IRTA_REG); - int nr_entry = 1 << ((iremap_maddr & 0xF) + 1); + uint64_t irta = dmar_readq(iommu->reg, DMAR_IRTA_REG); + uint64_t iremap_maddr = irta & PAGE_MASK; + unsigned int nr_entry = 1 << ((irta & 0xF) + 1); struct iremap_entry *iremap_entries = NULL; int print_cnt = 0;