Specifying memory limits with hadoop

  • Date: 2011-11-05 01:25:43
  • Tags: java, hadoop

I am trying to run a high-memory job on a Hadoop cluster (0.20.203). I modified the mapred-site.xml to enforce some memory limits.

  <!-- Cluster-wide cap, in MB, on the memory a single map/reduce task may request -->
  <property>
    <name>mapred.cluster.max.map.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapred.cluster.max.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <!-- Memory, in MB, allotted per map/reduce slot on each TaskTracker -->
  <property>
    <name>mapred.cluster.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapred.cluster.reduce.memory.mb</name>
    <value>2048</value>
  </property>

Within my job, I specify how much memory I will need. Unfortunately, even though I am running my process with -Xmx2g (the job runs fine as a standalone application with that much memory), I need to request far more memory for my mapper (as a sub-question: why is that?) or it gets killed.

val conf = new Configuration()
// JVM options passed to every child task JVM (mappers and reducers alike)
conf.set("mapred.child.java.opts", "-Xms256m -Xmx2g -XX:+UseSerialGC");
// Memory requested from the scheduler for each map/reduce task, in MB
conf.set("mapred.job.map.memory.mb", "4096");
conf.set("mapred.job.reduce.memory.mb", "1024");

Since I am running the job with what is essentially an identity reducer, the reducer should need hardly any memory.

  import org.apache.hadoop.mapreduce.Reducer
  import scala.collection.JavaConversions._  // lets the for-comprehension iterate a java.lang.Iterable

  // Identity reducer: writes every (key, value) pair through unchanged.
  class IdentityReducer[K, V] extends Reducer[K, V, K, V] {
    override def reduce(key: K,
        values: java.lang.Iterable[V],
        context: Reducer[K, V, K, V]#Context) {
      for (v <- values) {
        context.write(key, v)
      }
    }
  }
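
For reference, a minimal sketch of how this reducer might be wired into the job; the job name and the Text key/value types below are hypothetical, not taken from the original post:

  import org.apache.hadoop.io.Text
  import org.apache.hadoop.mapreduce.Job

  // Hypothetical driver wiring; assumes the conf built above and Text keys/values.
  val job = new Job(conf, "identity-job")
  job.setReducerClass(classOf[IdentityReducer[Text, Text]])
  job.setOutputKeyClass(classOf[Text])
  job.setOutputValueClass(classOf[Text])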

However, the reducer is still using a lot of memory. Is it possible to give the reducer different JVM arguments than the mapper? Hadoop kills the reducer, claiming it is using 3960 MB of memory! And the reducers eventually cause the whole job to fail. How is this possible?

TaskTree [pid=10282,tipID=attempt_201111041418_0005_r_000000_0] is running beyond memory-limits.
Current usage : 4152717312bytes.
Limit : 1073741824bytes.
Killing task.
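
On the question of giving the reducer different JVM arguments than the mapper: later Hadoop releases expose per-phase child options (mapred.map.child.java.opts and mapred.reduce.child.java.opts); whether 0.20.203 honors them is an assumption, so treat this as a sketch rather than a confirmed fix:

  // Assumes per-phase child options are honored by this Hadoop version.
  // If they are, the reducer can be given a smaller heap than the mapper.
  conf.set("mapred.map.child.java.opts", "-Xms256m -Xmx2g -XX:+UseSerialGC")
  conf.set("mapred.reduce.child.java.opts", "-Xms256m -Xmx512m -XX:+UseSerialGC")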

UPDATE: even when I specify a streaming job with cat as the mapper and uniq as the reducer, and -Xms512M -Xmx1g -XX:+UseSerialGC, my tasks take over 2g of virtual memory! This seems extravagant.

TaskTree [pid=3101,tipID=attempt_201111041418_0112_m_000000_0] is running beyond memory-limits.
Current usage : 2186784768bytes.
Limit : 2147483648bytes.
Killing task.

UPDATE: the original issue that changed the memory-usage configuration format specifically mentions that Java users are mostly interested in physical memory, to prevent thrashing. I think this is exactly what I want: I do not want a node to start a mapper if there is not enough physical memory available. However, these options all appear to have been implemented as virtual-memory constraints, which are difficult to manage.

Answers

From the CDH3 Release History (https://ccp.cloudera.com/display/CDHDOC/CDH3+Release+History); this refers to 0.2, but a similar issue likely applies to later versions:

...if you set mapred.child.ulimit, it's important that it must be more than two times the heap size value set in mapred.child.java.opts. For example, if you set a 1G heap, set mapred.child.ulimit to 2.5GB. Child processes are now guaranteed to fork at least once, and the fork momentarily requires twice the overhead in virtual memory.

It's also possible that setting mapred.child.java.opts programmatically is "too late"; you might want to verify it really is going into effect, and put it in your mapred-site.xml if not.
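
A minimal sketch of applying that advice from the job driver; mapred.child.ulimit takes a value in kilobytes, and the 2.5 GB figure simply follows the quoted guideline for a 1 GB heap rather than being a tested value:

  // 1 GB heap, with mapred.child.ulimit at roughly 2.5x the heap (value is in KB).
  conf.set("mapred.child.java.opts", "-Xms256m -Xmx1g -XX:+UseSerialGC")
  conf.set("mapred.child.ulimit", "2621440")  // 2.5 GB expressed in kilobytes

If setting these programmatically turns out to be "too late", the same properties can go into mapred-site.xml, as the answer suggests.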




