Using File Storage Parallel Tools
The Parallel File Tools suite provides parallel versions of tar,
rm, and cp. These tools can run requests on large
file systems in parallel, maximizing performance for data protection operations.
The toolkit includes:
- partar: Use this command to create and extract tarballs in parallel.
  Note: The partar tool supports the extraction of tar files created in the GNU basic tar POSIX 1003.1-1990 format. Files created in other archive formats, such as PAX, are not supported.
- parrm: You can use this command to recursively remove a directory in parallel.
- parcp: Use this command to recursively copy a directory in parallel.
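Because of the format restriction in the note above, it can help to inspect an archive before handing it to partar. The following is a minimal sketch and is not part of the Parallel File Tools; the function name tar_flavor is hypothetical. It assumes the standard tar header layout: byte 156 of a header block is the type flag (x or g marks a PAX extended header), and the magic field at byte 257 reads "ustar" followed by a space in GNU archives and "ustar" followed by a NUL byte in POSIX ustar archives. Only the first header block is examined, so treat the result as a hint rather than a guarantee of what partar accepts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess the tar flavor from the first 512-byte header block.
tar_flavor() {
    local archive=$1 typeflag magic
    # Byte 156 is the type flag. NUL bytes are mapped to '0' so the shell can hold them.
    typeflag=$(dd if="$archive" bs=1 skip=156 count=1 2>/dev/null | tr '\0' '0')
    # Bytes 257-262 are the magic field.
    magic=$(dd if="$archive" bs=1 skip=257 count=6 2>/dev/null | tr '\0' '0')

    case $typeflag in
        x|g) echo pax; return 0 ;;   # PAX extended or global header
    esac
    case $magic in
        'ustar ') echo gnu ;;        # GNU tar writes "ustar" plus a space
        'ustar0') echo ustar ;;      # POSIX ustar writes "ustar" plus NUL (shown here as 0)
        *)        echo unknown ;;    # compressed, old V7, or not a tar file
    esac
}
```

For example, `tar_flavor example.tar` prints one of pax, gnu, ustar, or unknown. A compressed archive such as a .tar.gz file reports unknown because its first bytes are not a tar header.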
Installing the Parallel File Tools
The tool suite is distributed as an RPM for Oracle Linux, Red Hat Enterprise Linux, and CentOS.
To install Parallel File Tools on an Oracle Linux instance:
- Open a terminal window on the destination instance.
- Type the following command:
sudo yum install -y fss-parallel-tools
To install Parallel File Tools on an Oracle Linux 8 instance:
- Open a terminal window on the destination instance.
- Install the Oracle Linux developer repository, if needed, by using the following command:
dnf install oraclelinux-developer-release-el8
- Install the Parallel File Tools from the developer repository by using the following command:
dnf --enablerepo=ol8_developer install fss-parallel-tools
To install Parallel File Tools on CentOS and Red Hat 6.x:
- Open a terminal window on the destination instance.
- Type the following commands:
sudo wget http://yum.oracle.com/public-yum-ol6.repo -O /etc/yum.repos.d/public-yum-ol6.repo
sudo wget http://yum.oracle.com/RPM-GPG-KEY-oracle-ol6 -O /etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
sudo yum --enablerepo=ol6_developer install fss-parallel-tools
To install Parallel File Tools on CentOS and Red Hat 7.x:
- Open a terminal window on the destination instance.
- Type the following commands:
sudo wget http://yum.oracle.com/public-yum-ol7.repo -O /etc/yum.repos.d/public-yum-ol7.repo
sudo wget http://yum.oracle.com/RPM-GPG-KEY-oracle-ol7 -O /etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
sudo yum --enablerepo=ol7_developer install fss-parallel-tools
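After installation, a quick check that the three commands are on the PATH can be sketched as follows. This is an illustrative helper, not something shipped with the package; the function name missing_tools is hypothetical, and the sketch assumes the commands are installed under the names partar, parrm, and parcp used in this topic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every listed command that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Report which of the three tools, if any, are missing on this instance.
missing=$(missing_tools partar parrm parcp)
if [ -n "$missing" ]; then
    echo "not installed:"
    echo "$missing"
else
    echo "all three tools found"
fi
```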
Using the Tools - Basic Examples
Here are some simple examples of how the different tools are commonly used in Oracle Cloud Infrastructure File Storage.
In this example, parcp is used to copy the directory "folder" in /source to /destination. The -P option is used to set the number of parallel threads you want to use.
$ parcp -P 16 /source/folder /destination
In the following example, parcp is used to copy the contents of the directory "folder" in /source to /destination. The "folder" directory itself is not copied.
$ parcp -P 16 /source/folder/. /destination
In the following example, partar creates a .tar archive of the contents of the specified directory and stores it as a tarball in the directory. The directory used to create the tarball is named example.
$ partar pcf example.tar example -P 16
In the following example, the directory used to create the tarball is again named example. The tarball is created in the /test directory.
$ partar pcf example.tar example -P 16 -C /test
Using the Tools - Advanced Examples
Here are some examples of how the different tools are used in more advanced scenarios.
You can specify which files and folders are included when you create a .tar archive using partar. Let's say you have a directory that looks like this:
[opc@example sourcedir]$ ls -l
total 180
-rw-r-----. 1 opc opc 0 Apr 15 02:55 example2020-04-15_02-55-33_217107549.error
-rw-r-----. 1 opc opc 10 Apr 15 03:18 example2020-04-15_02-55-33_217107549.log
-rw-rw-r--. 1 opc opc 12 Apr 15 03:18 example2020-04-15_03-18-13_267771997.error
-rw-rw-r--. 1 opc opc 10 Apr 15 03:18 example2020-04-15_03-18-13_267771997.log
-rwxr-xr-x. 1 opc opc 37 Nov 30 2017 File1.txt
-rwxr-xr-x. 1 opc opc 15 Dec 1 2017 File2.txt
-rwxr-xr-x. 1 opc opc 39 Nov 30 2017 File3.txt
-rwxr-xr-x. 1 opc opc 57 Dec 1 2017 File4.txt
The following command creates a .tar archive that:
- Contains a mydir directory named as specified.
- Includes File1.txt, File2.txt, File3.txt, and File4.txt.
- Excludes all .log and .error files.
- Sends the tarball from /sourcedir to /mnt/destinationdir.
- Extracts the .tar archive.
[opc@example sourcedir]$ sudo partar cf - mydir --exclude '*.log*' --exclude '*.err*' | sudo partar xf - -C /mnt/destinationdir
Performing ls -l on /mnt/destinationdir/mydir shows that only the desired files have been copied.
[opc@example mydir]$ ls -l
total 148
-rwxr-xr-x. 1 opc opc 37 Nov 30 2017 File1.txt
-rwxr-xr-x. 1 opc opc 15 Dec 1 2017 File2.txt
-rwxr-xr-x. 1 opc opc 39 Nov 30 2017 File3.txt
-rwxr-xr-x. 1 opc opc 57 Dec 1 2017 File4.txt
When excluding a directory or file from the archive, provide only the name of the directory or file. The --exclude option does not support absolute paths: a pattern given as an absolute path does not exclude the specified directory or files from the .tar archive. For example, to exclude a directory called testing under the source directory, use a command like the following:
sudo partar pczf name_of_tar_file.tar.gz /<path_source_directory> --exclude=testing
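The naming rule described above can be checked before partar is invoked. The following is a minimal sketch; the function name check_excludes is hypothetical, and it tests only the one rule stated in this topic, namely that an exclude pattern must not be an absolute path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if no pattern is an absolute path;
# otherwise report each offending pattern on stderr and return 1.
check_excludes() {
    local pattern rc=0
    for pattern in "$@"; do
        case $pattern in
            /*) echo "absolute path not usable with --exclude: $pattern" >&2
                rc=1 ;;
        esac
    done
    return "$rc"
}

# Names and wildcard patterns pass; an absolute path is rejected.
check_excludes testing '*.log*' && echo "patterns look usable"
check_excludes /source/testing || echo "fix the pattern before running partar"
```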
All files or directories that match the --exclude pattern under the path of the source directory will be excluded from the partar archive.
You can specify which files and folders are included when you use parcp to copy from one directory to another. Let's say you have a directory that looks like this:
[opc@example sourcedir]$ ls -l
total 180
-rw-r-----. 1 opc opc 0 Apr 15 02:55 example2020-04-15_02-55-33_217107549.error
-rw-r-----. 1 opc opc 10 Apr 15 03:18 example2020-04-15_02-55-33_217107549.log
-rw-rw-r--. 1 opc opc 12 Apr 15 03:18 example2020-04-15_03-18-13_267771997.error
-rw-rw-r--. 1 opc opc 10 Apr 15 03:18 example2020-04-15_03-18-13_267771997.log
-rwxr-xr-x. 1 opc opc 37 Nov 30 2017 File1.txt
-rwxr-xr-x. 1 opc opc 15 Dec 1 2017 File2.txt
-rwxr-xr-x. 1 opc opc 39 Nov 30 2017 File3.txt
-rwxr-xr-x. 1 opc opc 57 Dec 1 2017 File4.txt
First, create a .txt file containing a list of files you want to exclude. In this example, it's /home/opc/list.txt.
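One way to create such a list file is sketched here. The function name write_exclude_list is hypothetical, and the sketch writes to a temporary file rather than to /home/opc/list.txt so that it can be run anywhere. It assumes the file holds one pattern per line, using the three patterns from this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write one exclude pattern per line to the given file.
write_exclude_list() {
    local list_file=$1
    shift
    printf '%s\n' "$@" > "$list_file"
}

# Patterns are quoted so that the shell does not expand the wildcards.
list_file=$(mktemp)
write_exclude_list "$list_file" 'File4.txt' '*.log*' '*.err*'
cat "$list_file"
```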
The following command copies the contents from sourcedir to /mnt/destinationdir and:
- Copies File1.txt, File2.txt, and File3.txt.
- Excludes File4.txt and the .log and .error files, as listed in /home/opc/list.txt.
[opc@example ~]$ cat /home/opc/list.txt
File4.txt
*.log*
*.err*
[opc@example ~]$ date; time sudo parcp --exclude-from=/home/opc/list.txt -P 16 --restore /sourcedir /mnt/destinationdir; date
Mon Jun 1 15:58:30 GMT 2020
real 9m55.820s
user 0m3.602s
sys 1m5.441s
Mon Jun 1 16:08:25 GMT 2020
Performing ls -l on /mnt/destinationdir shows that only the desired files have been copied.
[opc@example destinationdir]$ ls -l
total 91
-rwxr-xr-x. 1 opc opc 37 Nov 30 2017 File1.txt
-rwxr-xr-x. 1 opc opc 15 Dec 1 2017 File2.txt
-rwxr-xr-x. 1 opc opc 39 Nov 30 2017 File3.txt
The --restore option in parcp is similar to using the -a, -r, -x, and -H options in rsync. (See the rsync(1) Linux man page.) The -P option sets the number of parallel threads to use.
The restore option includes the following behavior:
- Recurse into directories
- Stop at file system boundaries
- Preserve hard links, symlinks, permissions, modification times, group, owners, and special files such as named sockets and FIFO files
$ parcp -P 16 --restore /source/folder/ /destination
You can use parcp with the --restore and --delete options to sync files between a source and target folder. This is a good substitute for using rsync in parallel. As files are added or removed from the source directory, you can run this command at regular intervals to add or remove the same files from the destination directory. You can automate syncing by running this command in a cron job.
sudo parcp -P 32 --restore --delete /source/folder/ /destination
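The cron-based sync described above could be wrapped in a small script along the following lines. This is a sketch under stated assumptions: the function name sync_folder, the DRY_RUN variable, and the script path in the crontab comment are all hypothetical, and only the parcp options already shown in this topic (-P, --restore, --delete) are used. With DRY_RUN=1 the command is printed instead of executed, which makes it possible to review what a scheduled run would do.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sync command shown above, intended for use from cron.
sync_folder() {
    local src=$1 dst=$2 threads=${3:-32}
    # Build the command from the options used in this topic.
    set -- parcp -P "$threads" --restore --delete "$src" "$dst"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"      # dry run: show the command only
    else
        "$@"           # real run: execute parcp
    fi
}

# Dry run with the paths from the example above.
DRY_RUN=1 sync_folder /source/folder/ /destination

# Example crontab entry for root (script path is an assumption), every 15 minutes:
# */15 * * * * /usr/local/bin/sync_folder.sh
```

The example in this topic runs parcp with sudo. A job placed in the root crontab already runs with those privileges, so the sketch omits it.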