Build Cloudera parcels offline repo – without internet connection

By | August 22, 2014
First, create the directory where the parcels will be stored.


# mkdir -p /share/cdh_repo
# cd /share/cdh_repo

(Note: This can be any directory)

1) Download the CDH parcel to the repo directory:


# pwd
/share/cdh_repo
# wget http://archive-primary.cloudera.com/cdh5/parcels/latest/CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel

2) Download the manifest.json file from the URL:


# pwd
/share/cdh_repo
# wget http://archive-primary.cloudera.com/cdh5/parcels/latest/manifest.json

3) Copy the hash code:


# pwd
/share/cdh_repo

Open the manifest.json file from the URL:
http://archive-primary.cloudera.com/cdh5/parcels/latest/manifest.json
Find the section of the manifest that corresponds to the parcel you downloaded.
For example, if you are running CentOS 6 and copied the parcel file CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel, then you would look for the section:

"parcelName": "CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel",
"components": [
....
],
"replaces": "IMPALA, SOLR, SPARK",
"hash": "67fc4c86b260eeba15c339f1ec6be3b59b4ebe30"
},

Create an SHA file containing the hash code. For example:

# echo "67fc4c86b260eeba15c339f1ec6be3b59b4ebe30" > CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel.sha
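
The hash lookup and the creation of the .sha file can also be scripted. The following is only a minimal sketch, not part of the procedure above. It assumes Python 3, a manifest.json with a top-level "parcels" list (as in the manifest excerpt later in this post), and that the hash is the SHA-1 digest of the parcel file, which its 40 hex digits suggest. The names find_hash, sha1_of and write_sha_file are hypothetical helpers.

```python
import hashlib
import json
from pathlib import Path


def find_hash(manifest_path, parcel_name):
    """Return the "hash" value recorded for parcel_name in manifest.json."""
    manifest = json.loads(Path(manifest_path).read_text())
    for entry in manifest["parcels"]:
        if entry["parcelName"] == parcel_name:
            return entry["hash"]
    raise KeyError(f"{parcel_name} not found in {manifest_path}")


def sha1_of(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sha_file(repo_dir, parcel_name):
    """Write <parcel>.sha next to the parcel and return the hash written."""
    repo = Path(repo_dir)
    expected = find_hash(repo / "manifest.json", parcel_name)
    actual = sha1_of(repo / parcel_name)
    # Assumption: the manifest hash is the SHA-1 of the parcel file.
    if actual != expected:
        raise ValueError(f"hash mismatch: manifest {expected}, file {actual}")
    (repo / (parcel_name + ".sha")).write_text(expected + "\n")
    return expected
```

For example, write_sha_file("/share/cdh_repo", "CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel") would write the same file as the echo command above. If the SHA-1 assumption holds, the mismatch check also catches a truncated download.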

4) Create a server location to be accessible by CM as a URL:
(You can use any other server. Ex: Apache)

# cd /share
# nohup python -m SimpleHTTPServer 9080 &
nohup: ignoring input and appending output to `nohup.out'
(Press Enter to get the prompt back.)

The folder "/share/cdh_repo" now contains the following files:


# ls -1 /share/cdh_repo
CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel
CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel.sha
manifest.json

This is our repo address now:
http://ip-address-of-the-machine:9080/cdh_repo/
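
Note that SimpleHTTPServer exists only in Python 2. On a machine that has only Python 3, the standard-library http.server module plays the same role (from the shell: python3 -m http.server 9080, run in /share). As a minimal sketch, assuming Python 3.7 or later and the directory and port used above, the same server can be started from a script. The name make_server is a hypothetical helper.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory="/share", port=9080):
    """Build an HTTP server that serves `directory` on all interfaces."""
    # The `directory` argument of the handler is available from Python 3.7.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("", port), handler)
```

Calling make_server().serve_forever() then serves /share on port 9080, so the repo is again reachable under /cdh_repo/. Like SimpleHTTPServer, this is a plain file server without authentication, so it is probably best kept on a trusted network.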

5) Go to Cloudera Manager:
Administration > Settings > Parcels > Remote Parcel Repository URLs
Enter the address of the repo we just created:
http://ip-address-of-the-machine:9080/cdh_repo/

Save and click 'Check for New Parcels'.

You have your offline repo 🙂


Going one step further, we can make other services, such as Spark, available from the same URL.
1) Download the parcel file:


# pwd
/share/cdh_repo
# wget http://archive-primary.cloudera.com/spark/parcels/latest/SPARK-0.9.0-1.cdh4.6.0.p0.98-el6.parcel

2) Create an SHA file containing the hash code. For example:


# pwd
/share/cdh_repo
# echo "dd9b61b3ef24b5c1e8ecee75f3123924796d660d" > SPARK-0.9.0-1.cdh4.6.0.p0.98-el6.parcel.sha

3) Now, in one directory there can be only one manifest.json file.
To add Spark, we have to append its manifest details to the manifest.json we already have.


# pwd
/share/cdh_repo

Open the manifest.json from the URL: http://archive-primary.cloudera.com/spark/parcels/latest/manifest.json

Search for the name of the parcel we just downloaded (Ex: SPARK-0.9.0-1.cdh4.6.0.p0.98-el6.parcel),
copy the corresponding details, and append them to the "parcels" list of our manifest.json. (The second entry below is the newly added content.)


{
"lastUpdated": 14054557850000,
"parcels": [
{
"parcelName": "CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel",
"components": [

],
"replaces": "IMPALA, SOLR, SPARK",
"hash": "67fc4c86b260eeba15c339f1ec6be3b59b4ebe30"
}
,
{
"parcelName": "SPARK-0.9.0-1.cdh4.6.0.p0.98-el6.parcel",
"components": [
{ "name": "spark",
"version": "0.9.0-cdh4.6.0",
"pkg_version": "0.9.0",
"pkg_release": "1.cdh4.6.0.p0.46"
}
],
"depends": "CDH (>= 4.4), CDH (<< 5.0)",
"hash": "dd9b61b3ef24b5c1e8ecee75f3123924796d660d"
}

] }
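
Copying the entry by hand is error-prone, since a single missing comma makes the file invalid JSON. As a minimal sketch, assuming Python 3 and the top-level "parcels" list shown above, the hypothetical helper add_parcel_entry copies one entry from a second manifest into the existing one. The file name spark_manifest.json in the usage note is an assumed local copy of the Spark manifest.

```python
import json
from pathlib import Path


def add_parcel_entry(target_manifest, source_manifest, parcel_name):
    """Copy the entry for parcel_name from source_manifest into target_manifest."""
    target_path = Path(target_manifest)
    target = json.loads(target_path.read_text())
    source = json.loads(Path(source_manifest).read_text())

    # Locate the entry to copy; fail loudly if the source lacks it.
    matches = [p for p in source["parcels"] if p["parcelName"] == parcel_name]
    if not matches:
        raise KeyError(f"{parcel_name} not found in {source_manifest}")

    # Drop any stale entry with the same name so the merge can be re-run safely.
    target["parcels"] = [
        p for p in target["parcels"] if p["parcelName"] != parcel_name
    ]
    target["parcels"].append(matches[0])

    # json.dumps always emits valid JSON, so no comma can go missing.
    target_path.write_text(json.dumps(target, indent=2) + "\n")
    return [p["parcelName"] for p in target["parcels"]]
```

To use it, the Spark manifest would first be saved under a different name so that it does not collide with the existing file, for instance with wget -O spark_manifest.json followed by the Spark manifest URL. Then add_parcel_entry("manifest.json", "spark_manifest.json", "SPARK-0.9.0-1.cdh4.6.0.p0.98-el6.parcel") appends the Spark entry.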

4) Now we have both CDH and Spark in our repo:


# pwd
/share/cdh_repo

# ls -1
CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel
CDH-5.1.0-1.cdh5.1.0.p0.53-el6.parcel.sha
manifest.json
SPARK-0.9.0-1.cdh4.6.0.p0.98-el6.parcel
SPARK-0.9.0-1.cdh4.6.0.p0.98-el6.parcel.sha
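
Before pointing Cloudera Manager at the repo, it may be worth checking that every parcel in the directory has a manifest entry and a matching .sha file. The following is a minimal sketch, assuming Python 3 and the file layout shown above. The name check_repo is a hypothetical helper, and the messages it returns are made up for illustration.

```python
import json
from pathlib import Path


def check_repo(repo_dir):
    """Return a list of problems found in a local parcel repo (empty if none)."""
    repo = Path(repo_dir)
    manifest = json.loads((repo / "manifest.json").read_text())
    hashes = {p["parcelName"]: p["hash"] for p in manifest["parcels"]}
    problems = []

    # Start from the files on disk: the manifest may list parcels
    # (other distributions, for example) that were never downloaded.
    for parcel in sorted(repo.glob("*.parcel")):
        sha_file = repo / (parcel.name + ".sha")
        if parcel.name not in hashes:
            problems.append(f"{parcel.name}: no entry in manifest.json")
        elif not sha_file.exists():
            problems.append(f"{parcel.name}: missing .sha file")
        elif sha_file.read_text().strip() != hashes[parcel.name]:
            problems.append(f"{parcel.name}: .sha differs from manifest hash")
    return problems
```

An empty list from check_repo("/share/cdh_repo") means the three pieces agree for each parcel. It does not prove that Cloudera Manager will accept the repo.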

Comment below if you find this blog useful.