Algorithmically suggest best node to perform demanding computation

At work we perform demanding numerical computations.

We have a network of several Linux boxes with different processing capabilities. At any given time, there can be anywhere from zero to dozens of people connected to a given box.

I created a script to measure the MFLOPS (Millions of Floating-Point Operations per Second) of each box using the Linpack Benchmark; it also reports the number of cores and the memory.

I would like to use this information together with the load average (obtained using the uptime command) to suggest the best computer for performing a demanding computation. In other words, it's 3:00 pm; I have a meeting in two hours; I need to run a demanding process: which node will get me the answer fastest?

I envision a script which will output a suggestion along the lines of:

SUGGESTED HOSTS (IN ORDER OF PREFERENCE)
HOST1.MYNETWORK
HOST2.MYNETWORK
HOST3.MYNETWORK

Such a suggestion should favor fast computers (high MFLOPS) if the load average is low and, as the load average increases for a given node, it should favor available nodes instead (i.e., I'd rather run on a slower computer with no users than on an eight-core with forty dudes logged in).

How should I prioritize? What algorithm (rationale) would you use? Again, what I have is:

  1. Load Average (1min, 5min, 15min)
  2. MFLOPS measure
  3. Number of users logged in
  4. RAM (installed and available)
  5. Number of cores (important to normalize the load average)

Any thoughts? Thanks!

Best answer

You don't have enough data to make a well-informed decision. It sounds as though the scheduling is very volatile: "At any given time, there can be anywhere from zero to dozens of people connected to a given box." So the current load does not necessarily reflect the future load of the machines.

To properly assess which hosts someone should use to minimize computation time, you would need to know when the current jobs will terminate. If a powerful machine is about to finish most of its jobs, it would be a good candidate even though it currently has a high load.

If you want to guess purely from the current situation, you can do a weighted calculation to find out which hosts have the most MFLOPS available.

MFLOPS available = host's MFLOPS * (number of logical processors - load average)

Sort the hosts by MFLOPS available and suggest them in descending order.

This formula assumes that the MFLOPS of a host is linearly related to its load average. This might not be exactly true, but it's probably fairly close.
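As a rough illustration, the ranking can be sketched in a few lines of Python. This sketch reads the weighting as a product: rated MFLOPS times idle cores (logical processors minus load average, floored at zero so an overloaded host never scores negative). Treating the benchmark figure as per-core is an assumption, and the `Host` type, function names, and sample numbers are all hypothetical.

```python
from __future__ import annotations

from typing import NamedTuple


class Host(NamedTuple):
    name: str
    mflops: float  # benchmark result, assumed here to be a per-core figure
    cores: int     # number of logical processors
    load: float    # load average, e.g. the 1-minute value from uptime


def available_mflops(host: Host) -> float:
    """Rated MFLOPS scaled by the number of idle cores; never negative."""
    idle_cores = max(host.cores - host.load, 0.0)
    return host.mflops * idle_cores


def rank_hosts(hosts: list[Host]) -> list[Host]:
    """Return the hosts sorted from most to least available MFLOPS."""
    return sorted(hosts, key=available_mflops, reverse=True)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    hosts = [
        Host("HOST1.MYNETWORK", mflops=900.0, cores=8, load=7.5),
        Host("HOST2.MYNETWORK", mflops=400.0, cores=2, load=0.1),
        Host("HOST3.MYNETWORK", mflops=600.0, cores=4, load=1.0),
    ]
    print("SUGGESTED HOSTS (IN ORDER OF PREFERENCE)")
    for host in rank_hosts(hosts):
        print(host.name)
```

With these made-up numbers, the nearly saturated eight-core box drops to the bottom of the list even though it has the highest rated MFLOPS, which is the behavior the question asks for.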

I would favor the most recent load average, since it's closer to the current and future situation, whereas jobs from 15 minutes ago might have completed by now.
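One way to favor the most recent figure without discarding the others entirely is a weighted blend of the three load averages. The sketch below is only an illustration: the function name and the 0.6/0.3/0.1 weights are arbitrary assumptions, not values taken from the answer.

```python
from __future__ import annotations


def blended_load(
    load1: float,
    load5: float,
    load15: float,
    weights: tuple[float, float, float] = (0.6, 0.3, 0.1),  # assumed weights
) -> float:
    """Weighted mean of the 1-, 5- and 15-minute load averages.

    The default weights lean heavily on the 1-minute figure; they are
    normalized, so any non-negative triple with a positive sum works.
    """
    w1, w5, w15 = weights
    total = w1 + w5 + w15
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return (w1 * load1 + w5 * load5 + w15 * load15) / total


if __name__ == "__main__":
    # Hypothetical readings: a burst of work that has just finished
    # versus a burst that has just started.
    print(blended_load(0.5, 2.0, 4.0))  # load is falling
    print(blended_load(4.0, 2.0, 0.5))  # load is rising
```

On the local machine, Python's `os.getloadavg()` returns the same three figures that uptime prints, so its result can be unpacked straight into this function; for remote hosts the values would have to be collected some other way.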

Other answers

Have you considered a distributed approach to the computation? Not every computation can be broken up so that more than one CPU can work on it, but perhaps your problem space can benefit from some parallelization. Have a look at Hadoop.

You don't need to know FLOPS. The parallel computing center I go to uses Beowulf modules and has a script for this, for sure.

PDC operates leading-edge, high-performance computers on a national level. PDC offers easily accessible computational resources that primarily cater to the ...