Known Issues

btrfs, ext4 and xfs: Kernel panic when freeze and unfreeze operations are performed in multiple threads

Freeze and unfreeze operations performed across multiple threads on any supported file system can cause the system to hang and the kernel to panic. This is the result of a race condition that occurs when the unfreeze operation is triggered before the file system is actually frozen. The resulting unlock operation attempts to write to a non-existent lock, which causes the kernel panic. (Bug ID 25321899)

btrfs

The btrfs filesystem balance command does not warn that the RAID level can be changed under certain circumstances, and does not provide the choice of cancelling the operation. (Bug ID 16472824)
The copy-on-write nature of btrfs means that every operation on the file system initially requires disk space. On a disk that has no space left, you might not be able to execute any operation; even removing a file might not be possible. If there is no space to store metadata, an ENOSPC error is returned. In this situation, run sync before retrying the operation, as this can clear a background writeback that may be reserving metadata space. Another potential workaround is to add a disk or a file-backed loop device by using the btrfs device add command. The mechanism used to store data and metadata can also make the information returned by tools such as df confusing. Sometimes, metadata fills all of the disk space allocated for this purpose while space is still available for data. In this case, the file system is unbalanced and the problem can be resolved by performing a btrfs fi balance operation. See https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_get_.22No_space_left_on_device.22_errors.2C_but_df_says_I.27ve_got_lots_of_space for more information.
When you overwrite data in a file, starting somewhere in the middle of the file, the overwritten space is counted twice in the space usage numbers that btrfs qgroup show displays. Running btrfs quota rescan does not fix this issue. (Bug ID 16609467)
If you use the -s option to specify a sector size to mkfs.btrfs that is different from the page size, the created file system cannot be mounted. By default, the sector size is set to be the same as the page size. (Bug ID 17087232)
The btrfs-progs and btrfs-progs-devel packages for use with UEK R4 are made available in the ol6_x86_64_UEKR4 and ol7_x86_64_UEKR4 ULN channels and the ol6_UEKR4 and ol7_UEKR4 channels on the Oracle Linux Yum Server. In UEK R3, these packages were made available in the ol6_x86_64_latest and ol7_x86_64_latest ULN channels and the ol6_latest and ol7_latest channels on the Oracle Linux Yum Server.
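
The ENOSPC workarounds described in this section can be sketched as a short shell sequence. This is a minimal sketch, not a definitive procedure: the run and btrfs_enospc_recover helpers, the /mnt/btrfs mount point, and the /dev/loop0 device are hypothetical names chosen for illustration, and the order in which the device is added and the balance is run is an assumption. By default the script only prints the commands; set DRY_RUN=0 to execute them as root.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is 1 (the default here).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Work through the ENOSPC workarounds for the btrfs file system mounted at $1.
# An optional spare device ($2) is added before the balance to gain space.
btrfs_enospc_recover() {
    mnt=$1
    spare=${2:-}
    run sync                                  # flush writeback that may hold metadata space
    run btrfs filesystem df "$mnt"            # show data and metadata allocation
    if [ -n "$spare" ]; then
        run btrfs device add "$spare" "$mnt"  # temporary extra space
    fi
    run btrfs filesystem balance "$mnt"       # rebalance data and metadata
}

# Example with hypothetical paths; prints the commands without running them.
btrfs_enospc_recover /mnt/btrfs /dev/loop0
```

Keeping the dry-run mode as the default makes it possible to review the exact commands before anything is changed on a file system that is already short of space.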

ext4

System hangs when processing corrupted orphaned inode list

If the orphaned inode list is corrupted, the inode may be processed repeatedly, resulting in a system hang. For example, if the orphaned inode list contains a reference to the bootloader inode, ext4_iget() returns a bad inode, resulting in a processing loop that can hang the system. (Bug ID 24433290)

System hangs on unmount after an append to a file with negative i_size

While it is invalid for a file system to load an inode with a negative i_size, it is possible to create a file like this and append to it. This causes an integer overflow in the routines underlying writeback, which results in the kernel locking up. (Bug ID 25565527)

A hang occurs with the ext4 file system during the dynamic expansion of inode size when using the inode's i_extra_size field. (Bug ID 25718971)

xfs

Directory readahead completions can hang the system after unmount

Directory readahead can hang the system if the file system is unmounted suddenly after mount. If a directory readahead is delayed for long enough, buffer I/O completion may occur after the unmount has completed. The asynchronous nature of directory readahead I/O means that when the readahead I/O completion occurs, core data structures may have been freed, causing completion to run into invalid memory accesses. This can result in a kernel panic and system hang. (Bug ID 25550712)

Invalid corrupted file system error resulting from a problem with log recovery on v5 superblocks

A problem with log recovery on v5 superblocks, which causes the metadata LSN not to be updated for buffers that it writes out, can result in a corruption error similar to the following:

[1044224.901444] XFS (sdc1): Metadata corruption detected at
xfs_dir3_block_write_verify+0xfd/0x110 [xfs], block 0x1004e90
[1044224.901446] XFS (sdc1): Unmount and run xfs_repair
...
[1044224.901460] XFS (sdc1): xfs_do_force_shutdown(0x8) called from line 1249
of file fs/xfs/xfs_buf.c.  Return address = 0xffffffffa07a8910
[1044224.901462] XFS (sdc1): Corruption of in-memory data detected.  Shutting
down filesystem
[1044224.901463] XFS (sdc1): Please umount the filesystem and rectify the
problem(s)
[1044224.904207] XFS (sdc1): log mount/recovery failed: error -117
[1044224.904456] XFS (sdc1): log mount failed

The problem is that the log attempts to replay a buffer update that is no longer valid due to subsequent replayed updates. This results in a corruption error when, in fact, the file system is fine. (Bug ID 25380003)

System hangs on unmount after a buffered append to a file with negative i_size

While it is invalid for a file system to load an inode with a negative i_size, it is possible to create a file like this. A buffered append to such a file causes an integer overflow in the routines underlying writeback, which results in the kernel locking up. A direct append does not cause this behavior. (Bug ID 25565490)

System hangs during xfs_fsr on two-extent files with speculative preallocation

During an xfs_fsr process on extents that are generated by speculative preallocation, the code that determines whether all the extents fit inline miscalculates because the di_nextents call that is used does not account for these extents. This results in corruption of the in-memory inode and ultimately the code attempts to move memory structures using incorrectly calculated ranges. This causes a kernel panic. (Bug ID 25333211)

XFS quotas are disabled after a read-only remount on Oracle Linux 6

Quotas are disabled on XFS if the file system is remounted with read-only permissions on Oracle Linux 6. (Bug ID 22908906)

Overlay file system is unable to mount on XFS where there is no d_type support

Overlay file systems rely on a feature known as d_type support. This feature is a field within a data structure that provides some metadata about files in a directory entry within the base file system. Overlay file systems use this field to track many file operations such as file ownership changes and whiteouts. d_type support can be enabled in XFS when the file system is created, by using the -n ftype=1 option. When d_type support is not enabled, an overlay file system may become corrupt and behave in unexpected ways. For this reason, this update release of UEK R4 prevents the mounting of an overlay file system on an XFS base where d_type support is not enabled.

For backward compatibility reasons, the root partition on Oracle Linux is automatically formatted with -n ftype=0 when XFS is selected as the file system. If you already have overlay file systems in place and these are not hosted on alternate storage, you must migrate them to a file system that is formatted with d_type support enabled.

To check that the XFS file system is formatted correctly:
```
# xfs_info /dev/sdb1 |grep ftype
```
Replace /dev/sdb1 with the path to the correct storage device. If the information returned by this command includes ftype=0, you must migrate the overlay data held on this device to storage that is formatted correctly.
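
The same check can be scripted so that it returns a clear verdict for a device. The following is a minimal sketch: the xfs_ftype and check_overlay_base function names are hypothetical, and the script assumes that xfs_info reports the setting as ftype=0 or ftype=1, as described above.

```shell
#!/bin/sh
# Extract the ftype value (0 or 1) from xfs_info output read on stdin.
xfs_ftype() {
    sed -n 's/.*ftype=\([01]\).*/\1/p' | head -n 1
}

# Report whether an XFS device can be used as the base of an overlay mount.
# Usage (as root): check_overlay_base /dev/sdb1
check_overlay_base() {
    ftype=$(xfs_info "$1" | xfs_ftype)
    if [ "$ftype" = "1" ]; then
        echo "$1: ftype=1, d_type support is enabled"
    else
        echo "$1: ftype=${ftype:-unknown}, migrate overlay data to storage created with -n ftype=1"
        return 1
    fi
}
```

The non-zero return status for ftype=0 makes the function usable as a pre-upgrade check in a larger script.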

To format a new block device with an XFS file system that supports overlay file systems, run:
```
# mkfs -t xfs -n ftype=1 /dev/sdb1
```
Replace /dev/sdb1 with the path to the correct storage device. It is essential that you use the -n ftype=1 option when you create the file system.

If you do not have additional block storage available, it is possible to create an XFS file system image and loopback mount this. For example, to create a 5 GB image file in the root directory, you could use the following command:
```
# mkfs.xfs -d file=1,name=/OverlayStorage,size=5g -n ftype=1
```
To temporarily mount this file, you can enter:
```
# mount -o loop -t xfs /OverlayStorage /mnt
```
An entry in /etc/fstab, to make a permanent mount for this storage, may look similar to the following:
```
/OverlayStorage    /mnt        xfs     loop            0 0 
```
This configuration can help as a temporary solution to solve upgrade issues. However, using a loopback mounted file system image as a form of permanent storage is not recommended for production environments. (Bug ID 26165630)

Docker

Running yum install within a container on an overlayfs file system can fail with the following error:
```
Rpmdb checksum is invalid: dCDPT(pkg checksums): package_name
```
This error can break Dockerfile builds, but it is expected behavior from the kernel and is a known issue upstream (see https://github.com/docker/docker/issues/10180).

The workaround is to run touch /var/lib/rpm/* before installing the package.
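
As a minimal sketch, the workaround can be wrapped in a small helper so that the touch and the installation always run together, for example in a Dockerfile RUN step. The with_rpmdb_touch function name and the RPMDB_DIR override are hypothetical and exist only for illustration; package_name is a placeholder.

```shell
#!/bin/sh
# Touch every file in the RPM database directory, then run the given command.
# RPMDB_DIR defaults to /var/lib/rpm; overriding it is only useful for testing.
with_rpmdb_touch() {
    rpmdb_dir=${RPMDB_DIR:-/var/lib/rpm}
    touch "$rpmdb_dir"/* && "$@"
}

# Example, inside a container:
#   with_rpmdb_touch yum install -y package_name
```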

Note that this issue is fixed in any Oracle Linux images available on the Docker Hub or Oracle Container Registry, but the issue could still be encountered when running any container based on a third-party image. (Bug ID 21804564)

Docker can fail where it uses the overlay2 storage driver on XFS-formatted storage

A kernel patch has been applied to prevent overlay mounts on XFS if the ftype is not set to 1. This fix resolves an issue where XFS did not properly support the whiteout features of an overlay file system if d_type support was not enabled. If the Docker Engine is already using XFS-formatted storage with the overlay2 storage driver, an upgrade of the kernel can cause Docker to fail if the underlying XFS file system was not created with the -n ftype=1 option. The root partition on Oracle Linux 7 is automatically formatted with -n ftype=0 where XFS is selected as the file system. Therefore, if you intend to use the overlay2 storage driver in this environment, you must format a separate device for this purpose. (Bug ID 25995797)

Docker can fail where it uses the overlay2 storage driver and SELinux is enabled

If the Docker Engine is configured to use the overlay2 storage driver and SELinux is enabled and set to Enforcing mode, Docker containers are unable to function properly and permissions errors are encountered. If you intend to use Docker with the overlay2 storage driver, you must set SELinux to Permissive mode. (Bug ID 25684456)
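
One way to apply this setting is sketched below. It is an illustration under stated assumptions rather than a prescribed procedure: the set_selinux_permissive function name is hypothetical, the persistent setting is assumed to live in /etc/selinux/config, and sed -i relies on GNU sed.

```shell
#!/bin/sh
# Rewrite the SELINUX= line of an SELinux configuration file to "permissive".
# The path is a parameter so that the change can be tried on a copy first.
set_selinux_permissive() {
    config=${1:-/etc/selinux/config}
    sed -i 's/^SELINUX=.*/SELINUX=permissive/' "$config"
}

# Typical use, as root:
#   setenforce 0              # switch the running system to Permissive mode
#   set_selinux_permissive    # keep Permissive mode across reboots
#   getenforce                # prints the current mode
```

The pattern is anchored on SELINUX= so that the separate SELINUXTYPE= line in the same file is left untouched.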

DIF/DIX is not supported for ext file systems

The Data Integrity Field (DIF) and Data Integrity Extension (DIX) features that have been added to the SCSI standard are dependent on a file system that is capable of correctly handling attempts by the memory management system to change data in the buffer while it is queued for a write.

The ext2, ext3 and ext4 file system drivers do not prevent pages from being modified during I/O, which can cause checksum failures and a "Logical block guard check failed" error. Other file systems such as XFS are supported. (Bug ID 24361968)

DTrace

Argument declarations with USDT probe definitions cannot be declared with derived types such as enum, struct, or union.
The following compiler warning can be ignored for USDT probe definition arguments of type string (which is a D type but not a C type):
```
provider_def.h:line#: warning: parameter names (without types) in function declaration
```
Multi-threaded processes under ustack(), usym(), uaddr() and umod() which perform dlopen() in threads other than the first thread may not have accurate symbol resolution for symbols introduced by the dlopen(). (Bug ID 20045149)

LXC

The lxc-net service does not always start immediately after installation on Oracle Linux 6

The lxc-net service does not always start immediately after installation on Oracle Linux 6, even though this action is specified as part of the RPM post-installation script. This can prevent the lxcbr0 interface from coming up. If this interface is not up after installation, you can manually start it by running service lxc-net start. (Bug ID 23177405)

LXC read-only ip_local_port_range parameter

With lxc-1.1 or later and UEK R4, ip_local_port_range is a read-writable parameter under /proc/sys/net/ipv4 in an Oracle Linux container rather than being read-only. (Bug ID 21880467)

Console Appears to Hang when Booting

When booting Oracle Linux 6 on hardware with an ASPEED graphics controller, the console may appear to hang during the boot process after starting udev. However, the system does boot properly and is accessible. The workaround is to add nomodeset as a kernel boot parameter in /etc/grub.conf. (Bug ID 22389972)
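
The edit can also be scripted. The following is a minimal sketch that assumes the GRUB legacy layout, in which each boot entry has a line beginning with the word kernel; the add_nomodeset function name and the .bak backup suffix are hypothetical choices. The file is rewritten in place through a temporary copy so that a symbolic link at /etc/grub.conf is preserved.

```shell
#!/bin/sh
# Append nomodeset to every kernel line of a GRUB legacy configuration file
# that does not already contain it. A backup copy is kept as <file>.bak.
add_nomodeset() {
    grub_conf=${1:-/etc/grub.conf}
    tmp=$(mktemp) || return 1
    sed '/^[[:space:]]*kernel[[:space:]]/{
/nomodeset/!s/$/ nomodeset/
}' "$grub_conf" > "$tmp" &&
    cp -p "$grub_conf" "$grub_conf.bak" &&
    cat "$tmp" > "$grub_conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example, as root:
#   add_nomodeset /etc/grub.conf
```

Running the function a second time leaves the file unchanged, because lines that already contain nomodeset are skipped.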

OFED iSER target login fails from an initiator on Oracle Linux 6

An Oracle Linux 6 system with the oracle-ofed-release packages installed and an iSER (iSCSI Extensions for RDMA) target configured fails to log in to the iSER target as an initiator. On the Oracle Linux 6 initiator machine, the following behavior is typical:

# iscsiadm -m node -T iqn.iser-target.t1 -p 10.196.100.134 --login
Logging in to [iface: default, target: iqn.iser-target.t1, portal:
10.196.100.134,3260] (multiple)
iscsiadm: Could not login to [iface: default, target: iqn.iser-target.t1,
portal: 10.196.100.134,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not log into all portals

This is expected behavior resulting from an errata fix for CVE-2016-4564, to protect against a write from an invalid context. (Bug ID 23615903)

Open File Description (OFD) locks are not supported on NFSv4 mounts

NFS is not designed to handle OFD locking. (Bug ID 22948696)

Shared Receive Queue (SRQ) is an experimental feature for Reliable Datagram Sockets (RDS) and is disabled by default

The SRQ function that optimizes resource usage within the rds_rdma module is experimental and is disabled by default. A warning message is displayed when you enable this feature by setting the rds_ib_srq_enabled flag. (Bug ID 23523586)

Unloading or removing the `rds_rdma` module is unsupported

Once the rds_rdma module has been loaded, you cannot remove the module using either rmmod or modprobe -r. Unloading of the rds_rdma module is unsupported and can trigger a kernel panic. Do not set the module_unload_allowed flag for this module. (Bug ID 23580850)

Increased dom0 memory requirement when using Mellanox® HCAs on Oracle VM Server

Oracle VM Servers running UEK R4 U2 and later in dom0 require at least 400 MB more memory to use the Mellanox® drivers. This is because the default SRQ count was increased from 64K to 256K in later versions of the kernel and the scale_profile option is now enabled by default in the mlx4_core module.

If "Out of memory" errors are observed in dom0, increase the maximum dom0 memory size. Alternative workarounds involve manually setting the module parameters for the mlx4_core driver: edit /etc/modprobe.d/mlx4_core.conf and set scale_profile to 0 or, alternatively, set log_num_srq to 16. The preferred resolution to this issue is to increase the memory allocated to dom0 on Oracle VM Server. (Bug ID 23581534)
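
The module parameter workaround can be sketched as follows. The set_mlx4_core_option function name is hypothetical, and the sketch assumes that the file holds no other settings, because it overwrites the file with a single options line in the standard modprobe.d format. The setting is read the next time the mlx4_core module is loaded.

```shell
#!/bin/sh
# Write a single modprobe options line for the mlx4_core driver.
# $1: option assignment, for example scale_profile=0 or log_num_srq=16
# $2: target file (defaults to the path named in this note)
set_mlx4_core_option() {
    conf=${2:-/etc/modprobe.d/mlx4_core.conf}
    printf 'options mlx4_core %s\n' "$1" > "$conf"
}

# Example, as root:
#   set_mlx4_core_option scale_profile=0
```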

SDP performance degradation

The Sockets Direct Protocol (SDP), which was designed to provide an RDMA alternative to TCP over InfiniBand networks, is known to suffer from performance degradation on more recent kernels such as UEK R4 U2 and later. There is no active development on this protocol.

Although the library for this protocol is still available for this kernel, support is limited. You should consider using TCP on top of IP over InfiniBand as a more stable alternative. (Bug ID 22354885)

DHCP fails for KVM guest on a host using `i40e` driver module for its network interface card

The i40e driver module does not function correctly when bridged with a virtio network interface that makes a DHCP request from within a KVM guest. Although the DHCP request is sent through the virtio network interface, the request does not reach the network beyond the i40e network interface card. This regression is the result of a patch that was applied to enable VSI broadcast in promiscuous mode instead of adding a broadcast filter. The fix is under investigation. (Bug ID 25825419)

Hyper-V fcopy process fails when copying large files from host to guest

Oracle Linux guests, running UEK R4 and hosted on Microsoft® Windows 2012 R2 Hyper-V systems, can have trouble completing a file copy when using the Guest services facility, along with the hypervfcopyd service, to copy a large file from the host to the guest virtual machine. This issue can be seen when copying a file larger than 10GB from the host to the guest mount point and results in an error message appearing in the host log, similar to the following:

Copy-VMFile : The operation cannot be performed while the virtual machine is
in its current state. The name of the virtual machine is TestVM and its ID is 
9ebdc189-439a-4db2-b33f-05f3b07726bf.
At line:1 char:1 + Copy-VMFile "TestVM" -SourcePath "E:\ISO\largefile.7z" 
-DestinationPath "/mnt" -C ...
...
    + CategoryInfo          : InvalidResult: (:) [Copy-VMFile],
VirtualizationOperationFailedException
    + FullyQualifiedErrorId :
InvalidState,Microsoft.HyperV.PowerShell.Commands.CopyVMFileCommand

The issue can be reproduced across a variety of file system types. The issue is intermittent and does not appear during every copy attempt. (Bug IDs 25866707 and 25866691)