Flink on YARN application mode: Couldn't deploy Yarn Application Cluster
Original title:

Flink on YARN in application mode: job submission times out even though the YARN cluster still has enough free resources.

Used Resources:<memory:22GB, vCores:22>
Total Resources:<memory:109.54GB, vCores:64>

The resource-related parameters in yarn-site.xml are as follows:

<property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
</property>
<property>
    <name>yarn.resourcemanager.authorization.enabled</name>
    <value>false</value>
</property>

The exception is as follows:

2023-07-12 19:14:59,212:ERROR Thread-4724 (Yarncluster.java:84) - ClusterDeploymentException :
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn Application Cluster
    at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:465)
    ...
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_322]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
Caused by: java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_322]
    at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:1223) ~
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:593) ~[flink-!
    at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:458
    ... 4 more

I tried setting yarn.nodemanager.resource.detect-hardware-capabilities to true in yarn-site.xml, which enables auto-detection of node capabilities such as memory and CPU, but it didn't help.
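
For reference, enabling that setting in yarn-site.xml would look roughly like the fragment below. This is a sketch of the property named above, written in the same form as the other entries; it is not a confirmed fix for the deployment failure, and it assumes the NodeManagers are restarted so the change is picked up.

```xml
<!-- Sketch: ask each NodeManager to auto-detect its memory and CPU. -->
<!-- Assumes the NodeManagers are restarted after the file is edited. -->
<property>
    <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
    <value>true</value>
</property>
```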

Does anyone know what causes this issue? Thanks in advance for any replies.

Answers

No answers yet



