If there are leftovers, __end_that_request_first calls
blk_recalc_rq_sectors to set req->hard_nr_sectors correctly.
For failfast, subtracting bytes from the already updated
req->hard_nr_sectors and then calling end_that_request_chunk
and end_that_request_last can leave you with a bio that
never gets completed.

The attached patch, built against 2.6.4, does not subtract
bytes for non-blk_pc_requests.
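
To make the arithmetic concrete, here is a minimal user-space
sketch of the double subtraction. It is not kernel code:
struct fake_request, fake_end_chunk, leftover_old and
leftover_new are hypothetical names, and fake_end_chunk only
assumes that completing a chunk has already reduced
hard_nr_sectors by the completed bytes.

```c
#include <assert.h>

/* Hypothetical stand-in for struct request; only the field discussed here. */
struct fake_request {
	unsigned int hard_nr_sectors;	/* sectors still to be completed */
};

/*
 * Hypothetical model of end_that_request_chunk(): complete `bytes` and
 * update hard_nr_sectors, as the recalculation step is assumed to do.
 * Returns 1 if part of the request is still pending, 0 if it is done.
 */
static int fake_end_chunk(struct fake_request *req, unsigned int bytes)
{
	assert((bytes >> 9) <= req->hard_nr_sectors);
	req->hard_nr_sectors -= bytes >> 9;
	return req->hard_nr_sectors != 0;
}

/* Old calculation: subtracts bytes a second time. */
static unsigned int leftover_old(const struct fake_request *req,
				 unsigned int bytes)
{
	return (req->hard_nr_sectors << 9) - bytes;
}

/* Patched calculation: hard_nr_sectors already excludes the completed bytes. */
static unsigned int leftover_new(const struct fake_request *req)
{
	return req->hard_nr_sectors << 9;
}
```

With a 16-sector request and 4096 bytes completed,
fake_end_chunk leaves hard_nr_sectors at 8. leftover_old then
yields 0, while leftover_new yields 4096, the bytes that
still need to be ended.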

On a related note, I was wondering whether the following is a
bug or a feature. At the bottom of scsi_io_completion,
scsi_end_request is called on the bytes for the current buffer.
The comments indicate this is to handle improperly reported
medium errors, but commands that scsi_decide_disposition
determined should not be retried end up getting
requeued minus the current buffer.  [Mike Christie]
--- diff/drivers/scsi/scsi_lib.c	2004-03-16 09:37:57.396814528 +0000
+++ source/drivers/scsi/scsi_lib.c	2004-03-16 09:38:38.752527504 +0000
@@ -524,7 +524,7 @@ static struct scsi_cmnd *scsi_end_reques
 	 * to queue the remainder of them.
 	 */
 	if (end_that_request_chunk(req, uptodate, bytes)) {
-		int leftover = (req->hard_nr_sectors << 9) - bytes;
+		int leftover = req->hard_nr_sectors << 9;
 
 		if (blk_pc_request(req))
 			leftover = req->data_len - bytes;
