$OpenBSD: patch-hw_net_e1000_c,v 1.2 2016/01/22 07:00:15 ajacoutot Exp $

e1000: Avoid infinite loop in processing transmit descriptor (CVE-2015-6815)

While processing transmit descriptors, it could lead to an infinite
loop if 'bytes' was to become zero; Add a check to avoid it.

[The guest can force 'bytes' to 0 by setting the hdr_len and mss
descriptor fields to 0.

e1000: eliminate infinite loops on out-of-bounds transfer start

The start_xmit() and e1000_receive_iov() functions implement DMA transfers
iterating over a set of descriptors that the guest's e1000 driver
prepares:

- the TDLEN and RDLEN registers store the total size of the descriptor
  area,

- while the TDH and RDH registers store the offset (in whole tx / rx
  descriptors) into the area where the transfer is supposed to start.

Each time a descriptor is processed, the TDH and RDH register is bumped
(as appropriate for the transfer direction).

QEMU already contains logic to deal with bogus transfers submitted by the
guest:

- Normally, the transmit case wants to increase TDH from its initial value
  to TDT. (TDT is allowed to be numerically smaller than the initial TDH
  value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
  that QEMU currently has here is a check against reaching the original
  TDH value again -- a complete wraparound, which should never happen.

- In the receive case RDH is increased from its initial value until
  "total_size" bytes have been received; preferably in a single step, or
  in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
  RX descriptors are skipped without receiving data, while RDH is
  incremented just the same. QEMU tries to prevent an infinite loop
  (processing only null RX descriptors) by detecting whether RDH assumes
  its original value during the loop. (Again, wrapping from RDLEN to 0 is
  normal.)

What both directions miss is that the guest could program TDLEN and RDLEN
so low, and the initial TDH and RDH so high, that these registers will
immediately be truncated to zero, and then never reassume their initial
values in the loop -- a full wraparound will never occur.

The condition that expresses this is:

  xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)

i.e., TDH or RDH start out after the last whole rx or tx descriptor that
fits into the TDLEN or RDLEN sized area.

This condition could be checked before we enter the loops, but
pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
bogus DMA addresses, so we just extend the existing failsafes with the
above condition.

CVE-2016-1981

--- hw/net/e1000.c.orig	Tue Mar 10 13:38:27 2015
+++ hw/net/e1000.c	Wed Jan 20 21:44:11 2016
@@ -736,7 +736,8 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *d
                 memmove(tp->data, tp->header, tp->hdr_len);
                 tp->size = tp->hdr_len;
             }
-        } while (split_size -= bytes);
+            split_size -= bytes;
+        } while (bytes && split_size);
     } else if (!tp->tse && tp->cptse) {
         // context descriptor TSE is not set, while data descriptor TSE is set
         DBGOUT(TXERR, "TCP segmentation error\n");
@@ -814,7 +815,8 @@ start_xmit(E1000State *s)
          * bogus values to TDT/TDLEN.
          * there's nothing too intelligent we could do about this.
          */
-        if (s->mac_reg[TDH] == tdh_start) {
+        if (s->mac_reg[TDH] == tdh_start ||
+            tdh_start >= s->mac_reg[TDLEN] / sizeof(desc)) {
             DBGOUT(TXERR, "TDH wraparound @%x, TDT %x, TDLEN %x\n",
                    tdh_start, s->mac_reg[TDT], s->mac_reg[TDLEN]);
             break;
@@ -1058,7 +1060,8 @@ e1000_receive_iov(NetClientState *nc, const struct iov
         if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN])
             s->mac_reg[RDH] = 0;
         /* see comment in start_xmit; same here */
-        if (s->mac_reg[RDH] == rdh_start) {
+        if (s->mac_reg[RDH] == rdh_start ||
+            rdh_start >= s->mac_reg[RDLEN] / sizeof(desc)) {
             DBGOUT(RXERR, "RDH wraparound @%x, RDT %x, RDLEN %x\n",
                    rdh_start, s->mac_reg[RDT], s->mac_reg[RDLEN]);
             set_ics(s, 0, E1000_ICS_RXO);
