vSphere Big Data

Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

1. Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

splintereddy
Posted Feb 11, 2018 03:01 AM
Hi guys,
     Here is my lab environment:
          vCenter Server 6.5 U1 with Big Data Extensions 2.3 integrated
          Cloudera Manager 5.13.0 as the application manager, installed on CentOS 6.8, the same OS as the Big Data Extensions node template
          enough datastore space and enough addresses in the IP ranges
          a local DNS server; all forward and reverse lookups work
          a local Cloudera Manager yum repository and parcel repository. I have also already installed the Cloudera Manager agent, daemon, and Oracle j2sdk in the node template. (After installing, I removed the node template's snapshot and restarted the management server.)
     Here is the problem I encountered:
          When I try to create a big data cluster using Cloudera Manager as the application manager, I can see that the CDH version offered is exactly the one I put in my local parcel repository.
          When the cluster creation process finishes, the VMs have been cloned with the proper IPs and hostnames that I wrote in unbound.conf on my DNS server. The Cloudera Manager agent is started and the host agent installation is successful.
          But it ends with the error: "an exception happens when application manager creates the cluster. creation fails".
          It seems that Cloudera Manager couldn't install the parcel on the hosts.
Does anyone know why?
Thank you.
2. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

Qing Chi (Broadcom Employee)
Posted Feb 26, 2018 02:47 AM
Hi,
Could you provide the Serengeti log file, which is under the directory /opt/serengeti/logs?
Thanks,
-qing
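For readers following along, a quick way to pull candidate failure lines out of that directory is sketched below. It is only an illustration: the function name recent_errors and the search pattern error|exception are assumptions, and the names of the individual log files are not given in this thread, so the sketch simply searches the whole directory.

```shell
#!/bin/sh
# Hypothetical helper: print the most recent lines that mention an error or
# exception anywhere under a log directory.
# The default path is the one named in the thread; the pattern is an assumption.
recent_errors() {
    log_dir="${1:-/opt/serengeti/logs}"
    max_lines="${2:-50}"
    # -r: recurse into the directory, -i: ignore case,
    # -n: show line numbers, -E: extended regular expression
    grep -rinE 'error|exception' "$log_dir" 2>/dev/null | tail -n "$max_lines"
}
```

Calling `recent_errors` with no arguments searches /opt/serengeti/logs and prints at most the last 50 matching lines.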
3. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

splintereddy
Posted Feb 27, 2018 08:11 AM
Thanks for your help. It might be a problem with OS package dependencies.
I found a way to create the big data cluster successfully, but I'm still confused about how it works.
First, I deployed Big Data Extensions, added the DNS records to the DNS server, and added Cloudera Manager as the application manager. Then I tried to create the big data cluster using the local repository (including the Cloudera Manager agent, daemon, JDK, etc.) and the default CentOS repos. It failed while installing the Cloudera Manager agent, and the logs showed a package dependency error.
Then I installed the agent in the node template before creating the cluster, and it failed again. I guess the agent should be unique on every node, but if it is pre-installed in the template, every clone gets the same agent ID.
Then I modified the template: I renamed all the Centos-*.repo files except Centos-Media.repo to bak_Centos-*.repo and modified Centos-Media.repo to use my local CentOS yum repository.
Finally it succeeded.
Here are my questions:
     All the master and worker nodes can access the Internet, so CentOS package dependencies should resolve in theory, but it seems they do not.
     There is a shell script "set-local-repo" in the /opt/serengeti/sbin directory of the node template. It creates a "backup" directory and moves all the CentOS*.repo files into it when I use a local Cloudera Manager yum repo. What is the purpose of this? I commented out part of the code (lines 04-22) and it seems to make no difference. I know that the VM downloads and installs packages according to the repo files in /etc/yum.repos.d/, but what happens if all the repo files are moved to a subfolder? What is the purpose of moving all OS repos to the "backup" subfolder?
chmod 777 /etc/yum.repos.d
cd /etc/yum.repos.d
if [ ! -f /opt/serengeti/etc/keep_default_repo ]; then
    # create a backup folder first
    if [ ! -d "backup" ]; then
        mkdir backup
    fi
    # move all os repos to backup folder
    if ls /etc/yum.repos.d/CentOS*.repo 1>/dev/null 2>&1; then
        mv -f CentOS*.repo backup
    fi
    if ls /etc/yum.repos.d/rhel*.repo 1>/dev/null 2>&1; then
        mv -f rhel*.repo backup
    fi
    if ls /etc/yum.repos.d/fedora*.repo 1>/dev/null 2>&1; then
        mv -f fedora*.repo backup
    fi
fi
# for ambari we just return now
if [ $1 = "ambari" ]; then
    if rpm -q mysql-libs-5.1.73
    then
        yum remove -y mysql-libs-5.1.73
    fi
    exit 0
fi
# for cloudera-manager, we create a new local repo file
cat > aaa-local-app-manager.repo <<HERE
[$1]
name = local app manager yum server
baseurl = $2
gpgcheck = 0
enabled = 1
priority = 1
HERE
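A note on the question above, offered as a hedged sketch rather than an official answer: yum only reads *.repo files located directly in its repository directory, so files moved into the "backup" subfolder are no longer used. The presumed intent (an assumption, not confirmed anywhere in this thread) is to make the nodes install packages only from the local application manager repo. The sketch below replays the move and the repo-file creation in a scratch directory so the result can be inspected without touching /etc/yum.repos.d. The function name simulate_set_local_repo and the example URL are hypothetical, and the keep_default_repo check and the ambari branch of the real script are left out.

```shell
#!/bin/sh
# Replays the move-to-backup and repo-file steps of the quoted script inside
# a directory of your choice (not /etc/yum.repos.d).
# simulate_set_local_repo is a hypothetical name used only for this sketch.
simulate_set_local_repo() {
    repo_dir="$1"    # scratch directory holding sample *.repo files
    repo_id="$2"     # repo section name, e.g. cloudera-manager
    base_url="$3"    # base URL of the local yum repository
    (
        cd "$repo_dir" || exit 1
        mkdir -p backup
        for f in CentOS*.repo rhel*.repo fedora*.repo; do
            # An unmatched glob stays literal, so check the file really exists.
            if [ -f "$f" ]; then
                mv -f "$f" backup/
            fi
        done
        # Same repo file the quoted script writes for cloudera-manager.
        cat > aaa-local-app-manager.repo <<HERE
[$repo_id]
name = local app manager yum server
baseurl = $base_url
gpgcheck = 0
enabled = 1
priority = 1
HERE
    )
}
```

For example, running `simulate_set_local_repo /tmp/repotest cloudera-manager http://repo.example.local/cm5` on a directory that contains CentOS-Base.repo and epel.repo leaves epel.repo and aaa-local-app-manager.repo at the top level and moves CentOS-Base.repo under backup/.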
Besides, there is another small problem:
     Every time I reconnect to the vCenter Server or reboot the BDE server, an error appears: "get big data clusters failed. the ssl certificate does not exist or is not trusted". I then need to disconnect from and reconnect to the BDE management server to work around the error temporarily.
     How can I fix this error?
4. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

Qing Chi (Broadcom Employee)
Posted Feb 27, 2018 08:17 AM
Hi,
It is an issue with the BDE GUI, and we are fixing it. Could you file an SR with us? We can then follow the status on the SR.
Thanks,
-qing
5. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

splintereddy
Posted Feb 28, 2018 05:13 AM
I don't know how to file an SR, because I'm using BDE on a trial basis with a temporary vCenter Server license in my demo environment.
Besides, is the BDE developer team in China, or is there any technical support in China?
I've heard that BDE will not receive any further updates. Is that right? Is anyone using it in a production environment?
6. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

Qing Chi (Broadcom Employee)
Posted Feb 28, 2018 06:05 AM
Hi,
You can file an SR at my.vmware.com if you have a production license.
The BDE developer team is in China.
BDE is in maintenance mode right now. Many overseas customers still use BDE in production environments.
Thanks,
-qing
7. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

splintereddy
Posted Mar 01, 2018 09:18 AM
Well, I haven't bought a production license from VMware, and my test environment will expire on 2018.04.20.
I'm curious about how BDE operates: how it works when cloning and customizing a VM, deploying the Hadoop parcels, configuring services, etc. I've read some of its scripts, but it is still not very clear to me.
I also found that creating a small cluster using Cloudera Manager works fine, but creating a medium or larger cluster fails, which is weird.
Anyway, thanks for your help.
8. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

Qing Chi (Broadcom Employee)
Posted Mar 01, 2018 09:30 AM
Hi,
1, The BDE server can manage vCenter resources such as datastores, resource pools, networks and so on.
2, BDE can also create many types of Hadoop clusters, such as CDH, HDP and so on.
3, BDE can balance the Hadoop clusters' resources according to the vCenter resources, such as datastores and racks.
Anyway, you can prepare a Hadoop cluster using just one command.
Thanks,
-qing
9. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

splintereddy
Posted Mar 02, 2018 01:37 AM
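Thanks for your reply.
I may not have expressed myself clearly enough. What I really want to figure out is what each shell, Python, or Ruby script does, how it triggers the automatic deployment of Hadoop components, how it configures services, and so on.
I mean the principles and details behind it. Are there any technical papers on it?
Besides, can you read Chinese?
I have been studying BDE recently. I think this kind of product can simplify the preparatory work for creating a Hadoop cluster, but it may still have some small problems right now and is not quite suitable for production use. I am also researching how to make creating and configuring Hadoop clusters simpler and more automated, so I want to understand BDE's internal workflow and underlying principles in more depth, rather than just clicking a few times in the GUI to deploy a cluster.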
10. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

Qing Chi (Broadcom Employee)
Posted Mar 03, 2018 02:13 AM
Hi,
I'm Chinese as well.
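BDE mainly provides one-click creation of Hadoop clusters on vSphere infrastructure. You can define the Hadoop cluster's configuration by specifying a JSON file.
Below is an example JSON file: nodeGroups->master defines a Hadoop master node, worker defines the data nodes, and client defines the clients of whatever services are needed. In nodeGroups you can also define the resources Hadoop uses, such as CPU, memory, and storage. haFlag enables the vSphere HA feature, which automatically restarts a node when it has a problem. configuration provides an entry point for modifying the Hadoop cluster's configuration.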
{
"nodeGroups":[
    {
      "name": "master",
      "roles": [
        "hadoop_namenode",
        "hadoop_resourcemanager"
      ],
      "instanceNum": 1,
      "cpuNum": 2,
      "memCapacityMB": 7500,
      "storage": {
        "type": "SHARED",
        "sizeGB": 50
      },
      "haFlag": "on",
      "configuration": {
        "hadoop": {
        }
      }
    },
    {
      "name": "worker",
      "roles": [
        "hadoop_datanode",
        "hadoop_nodemanager"
      ],
      "instanceNum": 3,
      "cpuNum": 2,
      "memCapacityMB": 7500,
      "storage": {
        "type": "LOCAL",
        "sizeGB": 50
      },
      "haFlag": "off",
      "configuration": {
        "hadoop": {
        }
      }
    },
    {
      "name": "client",
      "roles": [
        "hadoop_client",
        "hive",
        "hive_server",
        "pig"
      ],
      "instanceNum": 1,
      "cpuNum": 1,
      "memCapacityMB": 3748,
      "storage": {
        "type": "LOCAL",
        "sizeGB": 50
      },
      "haFlag": "off",
      "configuration": {
        "hadoop": {
        }
      }
    }
],
// we suggest running convert-hadoop-conf.rb to generate "configuration" section and paste the output here
"configuration": {
    "hadoop": {
      "core-site.xml": {
        // check for all settings at http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/core-default.xml
        // note: any value (int, float, boolean, string) must be enclosed in double quotes and here is a sample:
        // "io.file.buffer.size": "4096"
      },
      "hdfs-site.xml": {
        // check for all settings at http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
      },
      "mapred-site.xml": {
        // check for all settings at http://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
      },
      "hadoop-env.sh": {
        // "HADOOP_HEAPSIZE": "",
        // "HADOOP_NAMENODE_OPTS": "",
        // "HADOOP_DATANODE_OPTS": "",
        // "HADOOP_SECONDARYNAMENODE_OPTS": "",
        // "HADOOP_JOBTRACKER_OPTS": "",
        // "HADOOP_TASKTRACKER_OPTS": "",
        // "HADOOP_CLASSPATH": "",
        // "JAVA_HOME": "",
        // "PATH": ""
      },
      "yarn-site.xml": {
        // check for all settings at http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
      },
      "yarn-env.sh": {
        // "YARN_OPTS": "",
        // "YARN_HEAPSIZE": "",
        // "JAVA_HEAP_MAX": "",
        // "YARN_RESOURCEMANAGER_OPTS": "",
        // "YARN_RESOURCEMANAGER_HEAPSIZE": "",
        // "YARN_NODEMANAGER_OPTS": "",
        // "YARN_NODEMANAGER_HEAPSIZE": "",
        // "YARN_PROXYSERVER_OPTS": "",
        // "YARN_PROXYSERVER_HEAPSIZE": "",
        // "YARN_CLIENT_OPTS": "",
        // "YARN_ROOT_LOGGER": "",
        // "YARN_CLASSPATH": ""
      },
      "log4j.properties": {
        // "hadoop.root.logger": "INFO,RFA",
        // "log4j.appender.RFA.MaxBackupIndex": "10",
        // "log4j.appender.RFA.MaxFileSize": "100MB",
        // "hadoop.security.logger": "DEBUG,DRFA"
      },
      "fair-scheduler.xml": {
        // check for all settings at http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
        // "text": "the full content of fair-scheduler.xml in one line"
      },
      "capacity-scheduler.xml": {
        // check for all settings at http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
      }
    }
}
}
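The sample above contains // comment lines, which strict JSON parsers reject. If you want to inspect or validate such a spec file with ordinary JSON tooling, a small filter like the following can remove them first. This is a sketch under assumptions: strip_spec_comments is a hypothetical name, and it only handles whole-line comments, which is the only form the sample uses.

```shell
#!/bin/sh
# Hypothetical helper: delete every line whose first non-blank characters
# are "//". Reads the named files, or standard input if none are given.
# Trailing comments after a value are not handled.
strip_spec_comments() {
    sed '/^[[:space:]]*\/\//d' "$@"
}
```

For example, `strip_spec_comments cluster-spec.json > cluster-spec.clean.json` writes a comment-free copy, where cluster-spec.json stands for whatever file name you saved the sample under.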
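BDE allocates resources according to the JSON file the user provides. The basic steps are:
1, Based on the available resources, compute where each Hadoop node will be placed, for example on which host and which storage.
2, Clone the required Hadoop nodes from the BDE template VM and place them on the hosts already computed.
3, Power on the Hadoop nodes and perform the initial configuration (networking, storage). Once all nodes have obtained an IP and FQDN, the infrastructure the Hadoop cluster needs is ready.
4, BDE automatically deploys the Hadoop services through the App manager the user chose and starts them as needed.
The advantage of BDE is that users can create and delete Hadoop clusters at any time according to their needs, without having to prepare much infrastructure (host, network, storage) each time a Hadoop cluster is created, which greatly reduces IT workload. As far as I know, the largest cluster among current BDE users has about 256 Hadoop data nodes, and it runs very stably.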
11. RE: Cannot create big data cluster using cloudera manager as application manager in big data extension2.3

mogleygull
Posted Jul 02, 2019 03:21 PM
Try using this big database management service support. The solutions provided by this company are designed for any kind of business and for diagnosing performance problems.
