Fixed Issues

The following issues have been fixed in this update.

Severe NVMe storage performance degradation issue fixed

A bug that caused significant performance degradation when using NVMe devices with earlier UEK R4 releases has been fixed. The issue caused sequential reads to be issued with abnormal request sizes, which severely limited I/O regardless of block size. NVMe performance-related patches were applied to the kernel to resolve the issue.
ocfs2: Slow journal replay fixed

A minor bug in the ocfs2 code caused journal replay for a dead node to take longer than necessary. To avoid a stale cache, all blocks from the dead node's journal inode were loaded from disk into memory, although performance is better if only the cached blocks are reloaded. A patch was applied to improve recovery performance.
ocfs2: Fixed issue releasing disk space after file deletion

An issue that caused ocfs2 to not release disk space after a large number of files were deleted has been fixed. The applied patches fix the code that extends credits while there are free cached blocks and while the truncate log is being flushed.
ocfs2: Fixed issue with unknown option 'ExecRestart' in section 'Service' for the o2cb.service file

An invalid entry in the o2cb.service file that was included in the ocfs2-tools user space package has been resolved. The problem caused the following message to appear in the systemd status or in /var/log/messages:
```
systemd[1]: [/usr/lib/systemd/system/o2cb.service:11] Unknown lvalue
'ExecRestart' in section 'Service'
```
The problem is now resolved in ocfs2-tools-1.8.6-9.el7 and later.
Kernel panic during storage device reset when using the lpfc driver module

A bug in the lpfc driver module that caused a kernel panic during a storage device reset has been fixed. The issue occurred both when an sg_reset command was issued and when SCSI Error Handling (EH) triggered a reset.
Hyper-V clock source changed to use TSC

An upstream fix that changed the Hyper-V clock source to use the Time Stamp Counter (TSC), for greater efficiency in kernel operations that involve reading time stamps, has been backported into this release.
Hyper-V storage driver performance improvements

Upstream updates to the storvsc Hyper-V storage driver were included to provide moderate performance improvement of I/O operations for certain workloads.
Hyper-V fix for guest reboot on failover issue

A bug that caused virtual machines running on Hyper-V to reboot during a graceful node failover, so that live migration was unsuccessful, has been fixed. The problem was caused by a failure to check that all heartbeat and vmbus messages were correctly processed. A patch was applied to resolve the problem.
Hyper-V fix for incorrect receive checksum offloading in the netvsc driver

A bug in the Hyper-V netvsc driver caused TCP packets with a bad receive checksum to be passed up the stack to the application layer, potentially causing data corruption. The fix included in this update causes packets with incorrect checksums to be dropped with an error.
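
The drop-on-bad-checksum behavior described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the netvsc driver code: the function names `internet_checksum` and `deliver` are hypothetical, and the checksum is computed over the payload alone, whereas a real TCP checksum also covers the TCP header and a pseudo-header.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def deliver(segments):
    """Pass up only payloads whose checksum matches; count the rest as dropped.

    `segments` is a list of (payload, claimed_checksum) pairs (hypothetical
    representation used for this illustration only).
    """
    accepted, dropped = [], 0
    for payload, claimed in segments:
        if internet_checksum(payload) == claimed:
            accepted.append(payload)
        else:
            dropped += 1  # corrupted data never reaches the application
    return accepted, dropped
```

In this model a corrupted payload is counted as dropped instead of being handed to the application, which is the outcome the fix is described as enforcing.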
Update for network bonding to fix primary_reselect with failure

The primary_reselect option used to define the reselection policy for the primary slave in a network bond would not behave correctly when set to failure or 2. The issue would result in the primary slave becoming active again when it had recovered. The expected, and documented, behavior is that the primary slave should not become active until the current active slave is down. A patch was applied to the bonding code to change the bond_find_best_slave functionality to avoid traversing members if the primary interface is not a candidate for failover or reselection and the current active slave is still up.
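
The documented reselection policies can be summarized in a short Python model. This is a sketch of the expected behavior only, not the kernel's bond_find_best_slave implementation; the function name `should_reselect_primary` and its parameters are hypothetical.

```python
def should_reselect_primary(policy, primary_up, active_up, primary_better=False):
    """Return True if the primary slave should become the active slave.

    policy: "always" (0), "better" (1) or "failure" (2).
    primary_better: whether the primary has better speed and duplex than
    the current active slave (only consulted for the "better" policy).
    """
    if not primary_up:
        return False  # a primary that is down is never a candidate
    if not active_up:
        return True   # the active slave failed, so the primary takes over
    if policy in ("always", 0):
        return True
    if policy in ("better", 1):
        return primary_better
    if policy in ("failure", 2):
        return False  # stay on the current active slave while it is up
    raise ValueError(f"unknown primary_reselect policy: {policy!r}")
```

With the failure policy, a recovered primary stays on standby for as long as the current active slave remains up, which matches the documented behavior that the patch restores.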
Fix for SCSI code that caused a kernel crash when a target node in an HA pair was rebooted

Several patches were applied to the SCSI driver code to fix an issue that caused the kernel to crash when a target node in an HA pair was rebooted on a SAN-booted LUN configured for multipath. The issue occurred when the SCSI target device was marked for removal and there was a delay in getting it into the DEL state. This could cause the same target to be marked for removal twice.
Fix applied for Mellanox® mlx4 driver to resolve the "Node crashed at cache_alloc_refill+0x1ab" error

A bug that allowed multiple work queues to be allocated to the same id_map_ent structure in the mlx4 driver code was patched. This issue could cause a kernel crash if a worker routine cleaned up and freed the structure. The patch checks that previously queued work on the structure has been successfully cancelled before new work is queued on the same structure.
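
The cancel-before-queue pattern described above can be sketched in Python as follows. This is a single-threaded illustration, not the mlx4 driver code: the `Entry` and `WorkQueue` classes are hypothetical stand-ins for the id_map_ent structure and the kernel work queue.

```python
class Entry:
    """Hypothetical stand-in for a structure that owns at most one work item."""

    def __init__(self):
        self.pending = None  # the work item currently queued, if any
        self.freed = False


class WorkQueue:
    def __init__(self):
        self.items = []

    def queue(self, entry, fn):
        # Cancel previously queued work on this entry before queueing more,
        # so two workers can never operate on (or free) the same entry.
        if entry.pending is not None:
            self.items.remove(entry.pending)
        work = (entry, fn)
        entry.pending = work
        self.items.append(work)

    def run_all(self):
        while self.items:
            entry, fn = self.items.pop(0)
            entry.pending = None
            fn(entry)
```

Queueing a second work item on the same entry replaces the first one, so a cleanup worker that frees the entry cannot be followed by a stale worker that still references it.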
Timer code patched for race condition that could result in a kernel oops

A patch was applied to fix an issue in the timer code that resulted in a kernel oops when a timer was migrated to an alternate CPU and had been left unlocked on the original CPU. The fix performs a proper migration and does the appropriate checks and locks during the migration to prevent the race condition.
Race condition in freeing aging forwarding tables in xsigo driver fixed

A fix was applied to the xsigo driver code to detect and avoid a potential race condition while accessing the forwarding table during the deletion of an aged forwarding entry. This issue could cause nodes to reboot inadvertently.
Race condition issue in the optrom functions in the qla2xxx driver fixed

A race condition in the qla2xxx driver, triggered when one thread modified the optrom buffer while another thread attempted to read from it, was patched in this update. The problem could result in a kernel panic and inadvertent system reboots. The issue was fixed by acquiring a mutex lock before checking the optrom state.
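
The approach of taking the lock before checking state can be sketched in Python. This is an illustrative model only, not the qla2xxx driver code: the `OptromModel` class, its method names and its state names are hypothetical.

```python
import threading


class OptromModel:
    """Hypothetical model of a buffer guarded by a state flag and a mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self.state = "idle"
        self.buffer = None

    def begin_read(self, data):
        # The state check and the buffer update happen under the same lock,
        # so no other thread can change either one in between.
        with self._lock:
            if self.state != "idle":
                return False
            self.buffer = bytes(data)
            self.state = "reading"
            return True

    def read(self):
        # Acquire the lock BEFORE checking the state, not after.
        with self._lock:
            if self.state != "reading":
                return None
            return self.buffer

    def release(self):
        with self._lock:
            self.buffer = None
            self.state = "idle"
```

Because every check of the state happens while the lock is held, a reader can never observe a valid state and then find that the buffer was released by another thread before the read completes.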
netxen driver patched for incorrect error handling

The QLogic/NetXen (1/10) GbE Intelligent Ethernet Driver was patched to fix an error handling issue that prevented the netxen_rom_fast_read() function from ever returning -1. Additional vendor patches were also applied.
RDS patched to fix QoS threshold calculation

When the Reliable Datagram Sockets (RDS) protocol was placed under loads that caused it to drop packets, the qos_threshold_exceeded parameter was not incremented because the RDMA payloads were calculated incorrectly. This caused the Quality of Service (QoS) functionality to fail. A patch was applied to fix this calculation so that the QoS threshold could be enforced.
RDMA package updated to fix mlx4_ib insertion error when RDMA starts

A bug that caused a benign insertion error when the RDMA service started was fixed in the rdma-3.10-3.0.25 package. With this update release, the RDMA package is updated to rdma-3.10-3.0.31 to provide several further bug fixes and code improvements.
NFSv4 issue with client incrementing the lock sequence number on NFS4ERR_MOVED fixed

A change in the NFSv4 specification caused a problem when the NFS client connected to a server based on the newer specification. If the client sent a lock to the source and received an NFS4ERR_MOVED response, resending the lock to the destination generated a bad sequence ID error: NFS4ERR_BAD_SEQID. The UEK R4 NFS client adheres to RFC 3530, which does not cover NFS4ERR_MOVED. A patch was applied to better adhere to RFC 7530 and to prevent the client from incrementing the lock sequence ID after receiving an NFS4ERR_MOVED from the server.
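
The patched sequence ID handling can be illustrated with a short Python sketch. This is not the kernel NFS client code: the function name `next_seqid` and the use of strings for status codes are hypothetical, and the list of errors that do not advance the sequence ID is an assumption based on RFC 7530, Section 9.1.7, so it should be verified against the RFC.

```python
# Errors after which the client does not increment the sequence ID
# (assumed from RFC 7530, Section 9.1.7; NFS4ERR_MOVED is the entry
# that RFC 3530 did not cover).
NO_INCREMENT_ERRORS = frozenset({
    "NFS4ERR_STALE_CLIENTID",
    "NFS4ERR_STALE_STATEID",
    "NFS4ERR_BAD_STATEID",
    "NFS4ERR_BAD_SEQID",
    "NFS4ERR_BADXDR",
    "NFS4ERR_RESOURCE",
    "NFS4ERR_NOFILEHANDLE",
    "NFS4ERR_MOVED",
})


def next_seqid(seqid: int, status: str) -> int:
    """Return the sequence ID to use for the next lock-owner operation."""
    if status in NO_INCREMENT_ERRORS:
        return seqid  # resend with the same sequence ID
    # Sequence IDs are unsigned 32-bit values and wrap around.
    return (seqid + 1) & 0xFFFFFFFF
```

In this model, a lock that is resent to the destination after NFS4ERR_MOVED carries the same sequence ID as the original request, so the destination does not reject it with NFS4ERR_BAD_SEQID.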