C++ programming for clusters and HPC

I need to write a scientific application in C++ that does a lot of computation and uses a lot of memory. I have part of the job done, but because of its high resource requirements I was thinking of moving to OpenMPI.

Before doing that I have a simple question: if I understand the principle of OpenMPI correctly, it is the developer who has the task of splitting the work across the nodes, calling SEND and RECEIVE based on the nodes available at that time.

Do you know whether there is a library, an OS or anything else that has this capability while letting my code remain as it is now? Basically, something that connects all the computers and lets them share their memory and CPUs as one?

I am a bit confused because of the huge volume of material available on the topic. Should I look at cloud computing, or at Distributed Shared Memory?

Best answer

Currently there is no C++ library or utility that will automatically parallelize your code across a cluster of machines. Granted, there are many other ways to achieve distributed computing, but you really want to be optimizing your application to use message passing or distributed shared memory.

Your best bets would be to:

  1. Convert your implementation into a task-based solution. There are a lot of ways to do this, but it will most definitely have to be done by hand.
  2. Clearly identify where you can break the tasks up and how these tasks communicate with each other.
  3. Use a higher-level library that builds on OpenMPI/MPICH -- Boost.MPI comes to mind.

Implementing a parallel distributed solution is one thing; making it work efficiently is another. Read up on the different topologies and parallel computing patterns to make implementing a solution a little less painful than starting from scratch.

Other answers

Well, you haven't actually stated exactly what hardware you are targeting. If it's a shared-memory machine then OpenMP is an option. Most parallel programmers would regard parallelisation with OpenMP as an easier option than using MPI in any of its incarnations. I'd also suggest that it is easier to retrofit OpenMP to an existing code than MPI. The best, in the sense of best-performing, MPI programs are those designed from the ground up to be parallelised with message-passing.

In addition, the best sequential algorithm might not be the most efficient one once it has been parallelised. Sometimes a simple but sequentially sub-optimal algorithm is a better choice.

You may have access to a shared-memory computer:

  • all multicore CPUs are effectively shared-memory computers;
  • on a lot of clusters the nodes are two or four CPUs strong; if each CPU has 4 cores, a four-CPU node gives you a 16-core shared-memory machine on your cluster;
  • if you have access to an MPP supercomputer you will probably find that each of its nodes is a shared-memory computer.

If you are stuck with message-passing then I'd strongly advise you to stick with C++ and OpenMPI (or whatever MPI is already installed on your system), and you should definitely look at Boost.MPI too. I advise this strongly because, once you step outside the mainstream of high-performance scientific computing, you may find yourself in an army of one, programming with an idiosyncratic collection of just-fit-for-research libraries and other tools. C++, OpenMPI and Boost are sufficiently well used that you can regard them as weapons-grade, or whatever your preferred analogy might be. There's little enough traffic on SO, for example, on MPI and OpenMP; check out the stats on the other technologies before you bet the farm on them.

If you have no experience with MPI then you might want to look at a book called Parallel Scientific Computing in C++ and MPI by Karniadakis and Kirby. Using MPI by Gropp et al. is OK as a reference, but it's not a beginner's text on programming for message-passing.

If message passing is holding you back, try distributed objects. There are a lot of distributed object frameworks available: CORBA, DCOM and ICE, to name a few. If you choose to distribute your objects, they will have global visibility through the interfaces (both data and methods) that you define. Any object on any node can access these distributed objects.

I have been searching for software that allows distributing memory, but haven't come across any. I guess it's because all these distributed object frameworks are available, and people don't have any need for distributing memory as such.

I had a good experience using Top-C in graduate school.

From the home page: "TOP-C especially distinguishes itself as a package to easily parallelize existing sequential applications."

http://www.ccs.neu.edu/home/gene/topc.html

Edit: I should add, it's much simpler to parallelize a program if it uses "trivial parallelism", i.e. the nodes don't need to share memory. MapReduce is built on this concept. If you can minimize the amount of shared state your nodes use, you'll see orders of magnitude better improvement from parallel processing.




