Appendix B Troubleshooting
This appendix offers initial suggestions for what to do when something goes wrong with applications running together with SMC. When problems occur, first check the list of common errors and their solutions; an updated list of
Problems and fixes reported to Scali will eventually be included in the appropriate sections of this manual. Please send relevant remarks by
Many problems originate from using the wrong application code, from stopped daemons that Scali MPI Connect relies on, or from an incomplete specification of network drivers. Some typical problems and their solutions are described below. Troubleshooting the DAT functionality is described in
B-1 When things do not work - troubleshooting
This section is intended as a starting point for software and hardware debugging. The main focus is on locating and repairing faulty hardware and software setup, but it can also help when getting started after installing a new system. For a description of the Scali Manage GUI, see the Scali System Guide.
B-1.1 Why does my program not start to run?
mpimon: command not found.
Include /opt/scali/bin in the PATH environment variable.
mpimon can’t find mpisubmon.
Set MPI_HOME=/opt/scali or use the
The application has problems loading libraries (libsca*).
Update the LD_LIBRARY_PATH to include /opt/scali/lib.
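The three environment settings above (PATH, MPI_HOME, LD_LIBRARY_PATH) can be checked in one pass before starting a job. The following is a minimal sketch, not part of SMC: the function name check_smc_environment is hypothetical, and it assumes the default installation prefix /opt/scali mentioned in the text. An unset MPI_HOME is reported only as a hint, since it matters only when mpimon cannot find mpisubmon.

```python
import os

# Installation prefix taken from the text above; an assumption if SMC
# was installed somewhere else.
SCALI_PREFIX = "/opt/scali"


def _entries(value):
    """Split a colon-separated search path and drop trailing slashes."""
    return [p.rstrip("/") for p in value.split(":") if p]


def check_smc_environment(env, prefix=SCALI_PREFIX):
    """Return a list of messages describing settings that look wrong.

    `env` is any mapping of environment variable names to values,
    for example os.environ. This helper is hypothetical and only
    mirrors the three fixes listed in the text.
    """
    problems = []

    # "mpimon: command not found" -> PATH must contain <prefix>/bin.
    if prefix + "/bin" not in _entries(env.get("PATH", "")):
        problems.append("PATH does not include " + prefix + "/bin")

    # "mpimon can't find mpisubmon" -> MPI_HOME should point at the prefix.
    if env.get("MPI_HOME", "").rstrip("/") != prefix:
        problems.append("hint: MPI_HOME is not set to " + prefix)

    # "problems loading libraries (libsca*)" -> LD_LIBRARY_PATH needs <prefix>/lib.
    if prefix + "/lib" not in _entries(env.get("LD_LIBRARY_PATH", "")):
        problems.append("LD_LIBRARY_PATH does not include " + prefix + "/lib")

    return problems


if __name__ == "__main__":
    for message in check_smc_environment(os.environ):
        print(message)
```

Running the script on a node prints one line per setting that differs from the values suggested above, and prints nothing when all three match.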
Incompatible MPI versions.
mpid, mpimon, mpisubmon and the libraries all have version variables that are checked at
1. Set the environment variable MPI_HOME correctly.
2. Restart mpid if a new version of ScaMPI has been installed without restarting mpid.
3. Reinstall SMC if a new version of SMC was not cleanly installed on all nodes.
Set working directory failed
SMC assumes that there is a homogeneous
ScaMPI uses wrong interface for
Set SCAMPI_NODENAME to hostname of correct interface.
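Before relying on SCAMPI_NODENAME, it can help to confirm which address the chosen hostname resolves to on the node. The following is a minimal sketch using only the Python standard library. The function name resolve_scampi_nodename is hypothetical, as is the hostname node1-ib used in the usage note; neither is part of SMC.

```python
import os
import socket


def resolve_scampi_nodename(env, resolve=socket.gethostbyname):
    """Return (hostname, address) for SCAMPI_NODENAME, or None if unset.

    `env` is a mapping such as os.environ. `resolve` defaults to the
    system resolver and can be replaced, for example in tests.
    """
    name = env.get("SCAMPI_NODENAME")
    if not name:
        return None
    # The returned address shows which interface the name maps to.
    return name, resolve(name)


if __name__ == "__main__":
    result = resolve_scampi_nodename(os.environ)
    if result is None:
        print("SCAMPI_NODENAME is not set")
    else:
        print("SCAMPI_NODENAME=%s resolves to %s" % result)
```

For example, with SCAMPI_NODENAME set to a hypothetical name such as node1-ib, the printed address should belong to the interface intended for MPI traffic. If it does not, the name resolution for that hostname needs to be corrected first.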
MPI_Wtime gives strange values
SMC uses a
Scali MPI Connect Release 4.4 Users Guide