I am running Flink on YARN in application mode. Job submission times out even though the YARN cluster still has plenty of free resources:
Used Resources:<memory:22GB, vCores:22>
Total Resources:<memory:109.54GB, vCores:64>
The resource-related parameters in yarn-site.xml are as follows:
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.resourcemanager.authorization.enabled</name>
<value>false</value>
</property>
The exception is as follows:
2023-07-12 19:14:59,212:ERROR Thread-4724 (Yarncluster.java:84) - ClusterDeploymentException:
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn Application Cluster
    at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:465)
    ...
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_322]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
Caused by: java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_322]
    at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:1223) ~
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:593) ~[flink-!
    at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:458)
    ... 4 more
I tried setting yarn.nodemanager.resource.detect-hardware-capabilities to true in yarn-site.xml, which enables auto-detection of node capabilities such as memory and CPU, but it did not help.
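The entry would look roughly like this, written in the same form as the other properties above (a sketch of the setting just described; its exact position within yarn-site.xml is an assumption):

```xml
<!-- Sketch: enables NodeManager auto-detection of memory and CPU, as described above -->
<property>
<name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
<value>true</value>
</property>
```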
Does anyone know what causes this? Thanks in advance for any replies.