D Downloading the Correct Versions of the Hadoop, Hive, and HBase Clients for a Local Repository
If you choose to download these dependencies from a local repository, use these instructions to add the correct client versions to the repository.
By default, the Jaguar installer attempts to download the clients from the Cloudera or HDP repositories on the Internet. If this access is restricted within your data center, you can specify a local repository instead. A local repository can be a local directory or an NFS path. It can also be a URL within the local network or on the Internet.
For CDH 5.x:
First, check the cluster management service (Cloudera Manager or Ambari) and find the versions of the Hadoop, Hive, and HBase services running on the Hadoop cluster. The compatible clients have the same versions. In each case, the client tarball filename includes a version string segment that matches the version of the service installed on the cluster. In the case of CDH, you can then browse the public repository and find the URL of the client that matches the service version.
-
Log on to Cloudera Manager and go to the Hosts menu. Select All Hosts, then Inspect All Hosts.
-
When the inspection is finished, select either Show Inspector Results (on the screen) or Download Result Data (to a JSON file).
-
In either case, scan the result set and find the service versions.
In the JSON version of the inspector results, there is a componentInfo section for each cluster that shows the versions of software installed on that cluster. The format of the data set is as follows:
"componentInfo": [
  ...
  {
    "cdhVersion": "CDH5",
    "componentRelease": "1.cdh5.11.1.p0.6",
    "componentVersion": "2.6.0+cdh5.11.1+2400",
    "name": "hadoop"
  },
  ...
-
Go to the following URL: https://archive.cloudera.com/cdh5/cdh/5/
Note:
Since February 2021, all Cloudera repositories require password authentication. You need to supply your Cloudera credentials to access and download the client JARs for CDH 5 or the client RPMs for CDH 6. If you are running Big Data Appliance, contact Oracle Support to request a patch with the specific clients that you need.
Look in the “hadoop,” “hive,” and “hbase” subdirectories of the CDH5 section of the archive. In the listings, you should find the client tarball packages for the versions of the services installed on the cluster, such as the following:
https://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.12.1.tar.gz
https://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.12.1.tar.gz
https://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.12.1.tar.gz
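The lookup described in these steps can be sketched in Python. This is only an illustration under stated assumptions, not part of the Jaguar installer. It assumes that the downloaded inspector JSON contains componentInfo entries shaped like the sample shown earlier, and that a tarball name is formed from the first two "+"-separated fields of componentVersion (so 2.6.0+cdh5.11.1+2400 would map to hadoop-2.6.0-cdh5.11.1.tar.gz). That mapping is consistent with the archive listings above, but you should confirm it against the actual listing. The function names and the "clusters" wrapper key in the sample are hypothetical.

```python
import json

# Archive location taken from the step above.
ARCHIVE_BASE = "https://archive.cloudera.com/cdh5/cdh/5"
SERVICES = ("hadoop", "hive", "hbase")


def find_component_info(node):
    """Yield each entry of every "componentInfo" list found anywhere in the JSON tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "componentInfo" and isinstance(value, list):
                yield from value
            else:
                yield from find_component_info(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_component_info(item)


def client_tarball_urls(inspector_json):
    """Map each service name to a candidate tarball URL (assumed naming rule)."""
    urls = {}
    for entry in find_component_info(json.loads(inspector_json)):
        name = entry.get("name")
        if name not in SERVICES:
            continue
        # "2.6.0+cdh5.11.1+2400" -> upstream "2.6.0", CDH release "cdh5.11.1"
        parts = entry.get("componentVersion", "").split("+")
        if len(parts) < 2:
            continue  # unexpected format; skip rather than guess
        upstream, cdh_release = parts[0], parts[1]
        urls[name] = f"{ARCHIVE_BASE}/{name}-{upstream}-{cdh_release}.tar.gz"
    return urls


# Hypothetical sample; the "clusters" wrapper is an assumption for illustration.
sample = """
{"clusters": [{"componentInfo": [
  {"cdhVersion": "CDH5", "componentRelease": "1.cdh5.11.1.p0.6",
   "componentVersion": "2.6.0+cdh5.11.1+2400", "name": "hadoop"}
]}]}
"""
print(client_tarball_urls(sample))
```

Because the search walks the whole JSON tree, the sketch does not depend on where the componentInfo sections sit in the inspector output.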
After you identify the correct versions of the clients and download them to the local repository, provide the path in the repositories section of the bds-config.json file used by the Jaguar installer.
For CDH 6.x:
The dir and url parameters in the bds-config.json configuration file are not supported on Cloudera 6.x systems. For CDH 6.x, set up a local repository before running the database-side installer, bds-database-install.sh, and include the --alternate-repo parameter on the installer command line as described in the Command Line Parameter Reference for bds-database-install.sh.
For HDP:
-
Log on to Ambari. Go to Admin, then Stack and Versions. On the Stack tab, locate the entries for the HDFS, Hive, and HBase services and note down the version number of each as the “service version.”
-
Click the Versions tab. Note down the version of HDP that is running on the cluster as the “HDP version base.”
-
Click Show Details to display a pop-up window that shows the full version string for the installed HDP release. Note this down as the “HDP full version.”
-
The last piece of information needed is the Linux version (“centos5,” “centos6,” or “centos7”). Note this down as the “OS version.”
To search through the HDP repository in Amazon S3 storage for the correct client URLs using the information acquired in these steps, you would need an S3 browser, a browser extension, or a command-line tool. As an alternative, you can piece together the correct URLs from these strings.
For HDP 2.5 and earlier, the URL pattern is as follows:
http://public-repo-1.hortonworks.com/HDP/<OS version>/2.x/updates/<HDP version base>/tars/{hadoop|apache-hive|hbase}-<service version>.<HDP full version>.tar.gz
Here are some examples. Note that the pattern of the gzip filename is slightly different for Hive. There is an extra “-bin” segment in the name.
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.2.0/tars/hadoop-2.7.1.2.3.2.0-2950.tar.gz
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.2.0/tars/apache-hive-1.2.1.2.3.2.0-2950-bin.tar.gz
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.2.0/tars/hbase-1.1.2.2.3.2.0-2950.tar.gz
In the following examples for HDP 2.5.6.0, each tarball is located in a hadoop, hive, or hbase directory under the tars directory:
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.5.6.0/tars/hadoop/hadoop-2.7.3.2.5.6.0-40.tar.gz
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.5.6.0/tars/hive/apache-hive-1.2.1000.2.5.6.0-40-bin.tar.gz
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.5.6.0/tars/hbase/hbase-1.1.2.2.5.6.0-40.tar.gz
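As a rough sketch of how the four noted strings combine into a URL, the following Python function assembles both layouts shown in the examples above. The function name, the TARBALLS table, and the per_service_dir flag are hypothetical names introduced for illustration. Which HDP releases use the per-service subdirectory is not settled here, so the caller chooses the layout explicitly.

```python
HDP_REPO = "http://public-repo-1.hortonworks.com/HDP"

# Tarball name prefix and suffix per service; Hive carries the extra "-bin" segment.
TARBALLS = {
    "hadoop": ("hadoop", ""),
    "hive": ("apache-hive", "-bin"),
    "hbase": ("hbase", ""),
}


def hdp_client_url(service, service_version, hdp_full_version,
                   hdp_version_base, os_version, per_service_dir=False):
    """Build a candidate client tarball URL from the strings noted in the steps."""
    prefix, suffix = TARBALLS[service]
    # Some listings place each tarball in a hadoop/, hive/, or hbase/ subdirectory.
    subdir = f"{service}/" if per_service_dir else ""
    return (f"{HDP_REPO}/{os_version}/2.x/updates/{hdp_version_base}/tars/"
            f"{subdir}{prefix}-{service_version}.{hdp_full_version}{suffix}.tar.gz")


# Values taken from the examples in this section.
print(hdp_client_url("hive", "1.2.1", "2.3.2.0-2950", "2.3.2.0", "centos6"))
print(hdp_client_url("hadoop", "2.7.3", "2.5.6.0-40", "2.5.6.0", "centos6",
                     per_service_dir=True))
```

A URL built this way is only a candidate. Check that it resolves before adding the tarball to the local repository.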
Alternative Method for HDP:
You can get the required software versions from the command line instead of using Ambari.
-
# hdp-select versions
Copy and save the numbers to the left of the dash as the “HDP version base”.
-
# hadoop version
# beeline --version
# hbase version
Use the output from these commands to formulate the <service version>.<HDP full version> segment for each URL.
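The string handling in these two steps can be sketched in Python. The sample inputs are assumptions about what the commands print, based on the version strings in the URL examples in this section: a line such as 2.5.6.0-40 from hdp-select, and a line such as Hadoop 2.7.3.2.5.6.0-40 from hadoop version. Compare them with the real output on your cluster. The function names are hypothetical.

```python
import re


def split_hdp_select_line(line):
    """Split one hdp-select output line into (HDP version base, HDP full version)."""
    full_version = line.strip()
    # The base is everything to the left of the dash, for example "2.5.6.0".
    base, _, _build = full_version.partition("-")
    return base, full_version


def version_segment(command_output, hdp_full_version):
    """Find the <service version>.<HDP full version> token in command output."""
    pattern = r"\d+(?:\.\d+)*\." + re.escape(hdp_full_version)
    match = re.search(pattern, command_output)
    return match.group(0) if match else None


# Assumed sample output lines, for illustration only.
base, full = split_hdp_select_line("2.5.6.0-40\n")
print(base, full)
print(version_segment("Hadoop 2.7.3.2.5.6.0-40", full))
```

The segment returned by the second function is the part of the tarball filename that follows the hadoop-, apache-hive-, or hbase- prefix.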