parallel debugging in MS Visual Studio 05

Last post 11-13-2007, 4:30 AM by HPC-Erlangen. 17 replies.
Page 1 of 2 (18 items)
  •  09-17-2007, 2:41 PM 1653

    parallel debugging in MS Visual Studio 05

    Hello,

    I want to debug my MPI programs using MS Visual Studio 2005. I have been trying hard and have read the available documents, but I haven't been able to get it working yet.

    I have a few questions:

    1. Do I need to install Compute Cluster Server to use the debugger?

    2. Is there documentation at an even lower (more introductory) level than the current one?

    Thanks in advance

    -Kedar

  •  09-18-2007, 6:11 AM 1654 in reply to 1653

    Re: parallel debugging in MS Visual Studio 05

    Hi Kedar.

    Maybe I can help this time instead of just posting questions and complaining about missing features in CCS. :-)

    > 1. Do I need to install the compute cluster server to use the debugger?
    No. You only need a working VS2005 Pro or Team Suite installation and an MPI implementation. I have successfully used MPICH2 and MSMPI (there is an SDK available that you can use, e.g., on your laptop).

    > 2. Is there a documentation which is at even low level than the current one?
    There are several MSDN articles on that topic. Search for "MPI Cluster Debugger" and you will find things like this: http://download.microsoft.com/download/6/8/d/68d7d82b-e477-4699-b403-72be2e6218b1/CCS03DebugParallelAppsVS05.doc. It explains the basics of MPI debugging with VS2005 and also how to debug jobs running on a CCS cluster.

    For our user base, we offer "Windows-HPC" courses on a regular basis. While the video transcripts are in German, the slides are in English, and this one might help you: http://www.rz.rwth-aachen.de/computing/events/2007/winhpc_2007/CT_06_Debugging.pdf. Slides 18-21 cover MPI debugging with VS2005, perhaps at the lower level you asked for.

     

    Hope this helps,
    Christian
    --
    Dipl.-Inform. Christian Terboven - High Performance Computing
    RWTH Aachen University, Center for Computing and Communication
    Seffenter Weg 23, D 52074 Aachen (Germany)
    Phone.: +49 241 80 24375 - Fax: +49 241 80 22504
    mailto:terboven@rz.rwth-aachen.de http://www.rz.rwth-aachen.de

  •  09-19-2007, 12:29 AM 1658 in reply to 1654

    Re: parallel debugging in MS Visual Studio 05

    Thanks a lot. The presentation was a great help. I had spaces in the path to mpishim, so whenever I tried to run the program a command window popped up and disappeared before I could read anything in it.
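
    A minimal sketch of a pre-flight check for this pitfall, in Python. It assumes (based only on the symptom described here, not on any documentation) that whitespace in a configured path can break the launch. The function name and the sample settings are made up for illustration.

```python
def paths_with_spaces(settings):
    """Return the names of settings whose path value contains whitespace."""
    return [name for name, path in settings.items()
            if any(ch.isspace() for ch in path)]

# Hypothetical debugger settings; the first path is invented to show a hit.
settings = {
    "MPIShim Location": r"C:\Program Files\mpishim\mpishim.exe",
    "Application Command": r"c:\temp\HelloWorldMPI.exe",
}

for name in paths_with_spaces(settings):
    print("Path contains spaces, consider moving it:", name)
```

    Copying mpishim to a directory without spaces is the fix that worked here; the check only points out which setting to look at.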

    So now I am parallel!!

    Debugging in a parallel environment is a little different from debugging a serial program. I can detach and attach processes in different VS sessions (as described in the presentation), but I still haven't got a good grip on parallel debugging. Any suggestions or documents would be highly appreciated.

    Thanks in advance

    Kedar

  •  10-23-2007, 8:42 AM 1760 in reply to 1654

    Re: parallel debugging in MS Visual Studio 05

    Hi Kedar, hi Christian,

    I, too, am having problems with parallel debugging of MPI jobs; perhaps you can advise me where to look for a solution.
    First of all, thanks for the online slides and video transcripts!

    I have an MPI CFD solver as well as a simple MPI HelloWorld, and both run well on the cluster with different nodes and setups.
    However, none of the suggestions for the MPI Cluster Debugger work in my setup.

    Configuration of the debugger:
    Debugger to launch:          MPI Cluster Debugger
    MPIRun Command:              mpiexec
    MPIRun Arguments:            -n 2
    MPIRun Working Directory:
    Application Command:         c:\temp\HelloWorldMPI.exe
    Application Arguments:
    MPIShim Location:            c:\temp\mpishimx64\mpishim.exe
    MPI network security mode:   Accept connections from any address

    After hitting the Debug button, with this configuration, a command prompt appears and vanishes immediately. There is nothing written in it and nothing else happens.

    I've tried this with various different Application Commands, using UNC paths, using a working directory and only the executable name.

    A misspelled mpishim.exe path leads to a window with the following message:

    launch failed: C:\WINDOWS\system32>c:\temp\mpishimx64\mpishim.exe2 c:\temp\mpishimx64\mpishim.exe2 CCSMASTER c:\temp\HelloWorldMPI.exe on 'ccsmaster' failed
    Error (2) The system cannot find the file specified.

    Do you have any hints?

    Running both programs from the command line with mpiexec is no problem, and running them on the cluster works without problems, too.
    Thanks in advance

    Johannes







  •  10-27-2007, 11:05 PM 1772 in reply to 1760

    Re: parallel debugging in MS Visual Studio 05

    To find out why the command prompt flashes and disappears, ask VS to pause the window at the end of execution, or alternatively create a batch file containing the following:

    --- myruncmd.bat ----
    mpiexec %*
    pause
    ---------

    Then replace the MPIRun command with myruncmd.bat (or cmd /c myruncmd.bat).
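
    If pausing the window is not enough, the same idea can be sketched in Python: run the command and keep its output and exit code in a log file. This is a swapped-in alternative to the batch file, not something VS or mpiexec provides; run_and_log and the log file name are hypothetical.

```python
import subprocess

def run_and_log(cmd, log_path):
    """Run cmd (a list of arguments), write its combined output and
    exit code to log_path, and return the exit code."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into stdout
        universal_newlines=True,    # decode output as text
    )
    with open(log_path, "w") as log:
        log.write("command: %s\n" % " ".join(cmd))
        log.write(result.stdout)
        log.write("\nexit code: %d\n" % result.returncode)
    return result.returncode
```

    For example, run_and_log(["mpiexec", "-n", "2", "HelloWorldMPI.exe"], "mpirun.log") would record whatever mpiexec printed before the window closed.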

    HTH,
    .Erez

     

  •  10-29-2007, 7:13 AM 1775 in reply to 1772

    Re: parallel debugging in MS Visual Studio 05


    Hello,

    I followed the suggestion above, and the command prompt now shows the following:


    C:\WINDOWS\system32>mpiexec -n 2 "c:\temp\mpishimx64\mpishim.exe" CCSMASTER "C:\temp\HelloWorldMPI\x64\debug\HelloWorldMPI.exe"

    C:\WINDOWS\system32>pause
    Press any key to continue . . .

    But still, nothing further happens.
    The same happens when the mpiexec command attaches to a previously submitted job running msvsmon:
    C:\WINDOWS\system32>mpiexec -n 2 -job 657.0 "c:\temp\mpishimx64\mpishim.exe" CCSMASTER "HelloWorldMPI.exe"

    C:\WINDOWS\system32>pause
    Press any key to continue . . .



    Running the program itself from the command line:

    C:\temp\HelloWorldMPI\x64\debug>mpiexec -n 2 HelloWorldMPI.exe
    Startup!
    Hello world from process 0 of 2
    Startup!
    Hello world from process 1 of 2

    C:\temp\HelloWorldMPI\x64\debug>


    Everything works as expected.
    For testing purposes, I moved the VS 2005 project to one of our compute nodes, with exactly the same properties. There, debugging starts and stops at the given breakpoints. However, although I passed a -job ID to the mpiexec command, nothing is executed on any node other than the current one.
    The only major difference between the development node and the compute nodes is that the development node is also the head node and has the Intel C++ and Fortran compilers installed. They provide plugins for Visual Studio 2005.

    Any ideas?

    Thanks in advance

    Johannes





  •  10-29-2007, 12:22 PM 1777 in reply to 1775

    Re: parallel debugging in MS Visual Studio 05

    Hi Johannes,

    My hunch is that your debugger is not installed correctly. See https://msdn2.microsoft.com/en-us/library/ms164731.aspx.

     

    As for debugging compute nodes remotely, you can either:
    1. submit a job that runs until canceled and pass its jobid.taskid as the argument to mpiexec -job, or
    2. change your MPIRun command to 'job submit mpiexec'.

    This is needed because the compute nodes are a resource managed by the CCS scheduler. Thus, you need to let the scheduler know that you're using the resource (otherwise access will be denied).

    thanks,
    .Erez

  •  10-31-2007, 6:36 AM 1784 in reply to 1777

    Re: parallel debugging in MS Visual Studio 05

    Hello erezh,

    The documentation at the link you provided was the guideline for the tests I described above.
    When I came back to test today, it worked without any reconfiguration, so unfortunately I don't know what the problem was.
    Perhaps a service was simply hanging and needed the restart that I performed yesterday.

    Now I am submitting a new job for each debug run, with the following settings:


    MPIRUN:         job submit /askednodes:ccs008,ccs009 mpiexec
    MPIRUN Args:    -np 16
    App Command:    \\somuncpath\HelloWorldMPI.exe
    MPISHIM Loc:    c:\temp\mpishimx64\mpishim.exe
    MPI NETWORK:    accept from any address

    All 16 processes are now executed on node ccs008.

    How can I specify that the processes should be distributed across the compute nodes?

    If I specify -host ccs008,ccs009 as an mpiexec argument, I get an MPI error stack:

    mpiexec running on ccs008 is unable to connect to msmpi service on ccs008,ccs009:8677

    Other MPI error, error stack:
    MPIDU_Sock_post_connect_filter(1278): unable to connect to ccs008,ccs009 on port 8677, no endpoints
    MPIDU_Sock_post_connect_filter(1298): gethostbyname failed, The requested name is valid, but no data of the requested type was found.  (errno 11004)

    Regular MPI jobs work on two or more nodes.

    Thanks in advance,

    Johannes



  •  11-01-2007, 12:15 AM 1790 in reply to 1784

    Re: parallel debugging in MS Visual Studio 05

    Remove the -np 16 option from the mpiexec arguments.

    mpiexec gets the right number of processes to spawn from the CCS scheduler.

  •  11-01-2007, 8:34 AM 1792 in reply to 1790

    Re: parallel debugging in MS Visual Studio 05

    Hi,

    Yes, I think it should be that way. However, if I remove the option, leaving just:

    MPIRUN Command:   
     job submit  /askednodes:ccs008,ccs009 /stderr:\\ccsmaster\ccsshare\rohabich\err.txt  mpiexec


    only 1 process is spawned.

    If I add the /numprocessor:8-8 flag, only 4 processes are spawned on one node and none on the other.

    What I would like is 4 processes on node ccs008 and 4 on node ccs009.

    Thanks,

    Johannes
  •  11-05-2007, 1:54 AM 1796 in reply to 1792

    Re: parallel debugging in MS Visual Studio 05

    Yes, you need the /numprocessors flag; that should have done it, and you should be running 4 processes on each node (assuming you have 4 processors in each box).

    Could it be that the debugger shim is not set up correctly on the other node?

    I suggest that you also specify a stdout file (some errors are reported there); it can be the same file as stderr.

    I also suggest that you first try job submit without the debugger, to verify that jobs are executed correctly on your cluster.

     

    Thanks,
    .Erez

  •  11-05-2007, 1:59 AM 1797 in reply to 1796

    Re: parallel debugging in MS Visual Studio 05

    Execute the following command to make sure jobs are executed correctly:

    job submit  /numprocessors:8 /askednodes:ccs008,ccs009 /stdout:\\ccsmaster\ccsshare\rohabich\out.txt /stderr:\\ccsmaster\ccsshare\rohabich\out.txt mpiexec -l hostname

    Each line of the output should contain the rank number and the name of the host. You should see 4 ranks on each host.
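
    To check the result without counting by eye, here is a minimal Python sketch that tallies ranks per host. It assumes each output line has the form "[rank]hostname"; the function name and the sample text are illustrative only.

```python
import re
from collections import Counter

# One line of `mpiexec -l hostname` output is assumed to look like "[3]ccs008".
RANK_LINE = re.compile(r"^\[(\d+)\]\s*(\S+)$")

def ranks_per_host(output):
    """Count how many ranks reported each host name."""
    counts = Counter()
    for line in output.splitlines():
        match = RANK_LINE.match(line.strip())
        if match:
            counts[match.group(2).lower()] += 1
    return dict(counts)

# Illustrative output for 8 ranks on two nodes (line order is not guaranteed).
sample = "\n".join([
    "[1]ccs008", "[0]ccs008", "[4]ccs009", "[2]ccs008",
    "[5]ccs009", "[3]ccs008", "[7]ccs009", "[6]ccs009",
])

print(ranks_per_host(sample))
```

    A correct 8-process run on two 4-processor nodes should give a count of 4 for each host; any other split means the scheduler did not distribute the ranks as requested.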

  •  11-12-2007, 5:47 AM 1822 in reply to 1797

    Re: parallel debugging in MS Visual Studio 05

    Hello,

    Thanks for the help.
    The out.txt file contains the following:
    [3]ccs008
    [2]ccs008
    [7]ccs009
    [1]ccs008
    [6]ccs009
    [0]ccs008
    [5]ccs009
    [4]ccs009

    So everything should be right with the cluster setup.
    Your suggestion that the installation of the debugger has a problem seems to be right.
    However, I don't know how to find the problem.
    I'll try to describe briefly what I did.

    From a central share, I ran the remote debugger installation wizard from
    \\ccsmaster\c$\Program Files\Microsoft Visual Studio 8\Microsoft Visual Studio 2005 Remote Debugger (x64) - ENU\install.exe

    I did not install it as a service; instead, I launch the executable C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\Remote Debugger\x64\msvsmon.exe on every node I want to debug, as the same user who will launch the debugging session in MS VS 2005.

    Additionally, the contents of the folder C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\Remote Debugger\x64 are present at c:\temp\mpishimx64\ on every node.

    Now I configure the MPI Cluster Debugger inside MS VS 2005 according to the earlier suggestions and press the Debug button.
    It works nowhere except on the node where I have the full MS VS 2005 installed.

    On the other nodes, msvsmon.exe shows that someone is connecting, but Visual Studio doesn't enter the debugging phase and the job is canceled.
    The error and output files are both empty.

    Thanks,

    Johannes
  •  11-12-2007, 5:57 PM 1831 in reply to 1822

    Re: parallel debugging in MS Visual Studio 05

    Okay, so you have msvsmon running on each compute node.

    Let's verify this with the simple hostname application. Set up the debugger as follows:

    MPIRUN:  job submit /askednodes:ccs008,ccs009 /numprocessors:8 mpiexec /stderr:\\ccsmaster\ccsshare\rohabich\err.txt  /stdout:\\ccsmaster\ccsshare\rohabich\err.txt 
    MPIRUN Args:        
    MPIRun Working Directory:
    App Command:         hostname.exe
    MPISHIM Loc:        c:\temp\mpishimx64\mpishim.exe
    MPI NETWORK:   accept from any address

    You should see the same results in the output.

  •  11-12-2007, 5:57 PM 1832 in reply to 1831

    Re: parallel debugging in MS Visual Studio 05

    correction:

    MPIRUN:  job submit /askednodes:ccs008,ccs009 /numprocessors:8 /stderr:\\ccsmaster\ccsshare\rohabich\err.txt  /stdout:\\ccsmaster\ccsshare\rohabich\err.txt  mpiexec
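
    As the correction implies, the /stdout and /stderr options belong to job submit and have to come before mpiexec. A small Python sketch that assembles the MPIRUN string in that order is given here. build_mpirun is a hypothetical helper, and it uses only the job submit options already shown in this thread.

```python
def build_mpirun(nodes, numprocessors, stdout_path=None, stderr_path=None):
    """Assemble a 'job submit ... mpiexec' string with every scheduler
    option placed before mpiexec."""
    parts = ["job", "submit",
             "/askednodes:" + ",".join(nodes),
             "/numprocessors:%d" % numprocessors]
    if stdout_path:
        parts.append("/stdout:" + stdout_path)
    if stderr_path:
        parts.append("/stderr:" + stderr_path)
    parts.append("mpiexec")  # must come last; mpiexec arguments follow it
    return " ".join(parts)

log = r"\\ccsmaster\ccsshare\rohabich\err.txt"
print(build_mpirun(["ccs008", "ccs009"], 8, stdout_path=log, stderr_path=log))
```

    Building the string in one place makes it harder to append a scheduler option after mpiexec by accident, which is the mistake the correction fixes.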
