General Recommendations for Parallelization

Recommendations for achieving speedup for OptiStruct jobs.

Note: Performance for specific models may vary due to factors including model type and size, processor count, processor type, disk, memory, I/O bandwidth, local or mounted drives, and system load.
OptiStruct Model/Solution | Parallelization Recommendation | Comments
Linear Static | Hybrid (SMP + DDM)
  • You can use as many DDM processes (-np) as allowed by available memory, and the remainder of the cores can be distributed as SMP (-nt).
  • If multiple subcases with different boundary conditions (BCs) exist, Multi-level DDM will likely be turned on automatically, and the subcases with different BCs are solved in parallel.
  • The PCG solver may improve performance in cases where DDM cannot be run. Refer to SOLVTYP and Solvers for more information.
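
The split between DDM processes and SMP threads described above can be sketched as follows. This is only an illustration, not an OptiStruct utility: the function name, the memory-per-process estimate, and the assumption that -nt is the thread count per DDM process (so that -np times -nt does not exceed the core count) are all hypothetical and should be checked against your installation.

```python
def split_cores(total_cores, mem_available_gb, mem_per_ddm_process_gb):
    """Return (n_procs, n_threads) for a hybrid SMP + DDM run.

    Hypothetical helper: mem_per_ddm_process_gb is a user-supplied estimate,
    and n_threads is assumed to be the SMP thread count per DDM process.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    if mem_per_ddm_process_gb <= 0:
        raise ValueError("mem_per_ddm_process_gb must be positive")

    # Memory caps how many DDM processes can run at once.
    max_procs_by_memory = int(mem_available_gb // mem_per_ddm_process_gb)

    # Use as many DDM processes as memory and cores allow, but at least one.
    n_procs = max(1, min(total_cores, max_procs_by_memory))

    # Spread the remaining cores evenly as SMP threads per process.
    n_threads = max(1, total_cores // n_procs)
    return n_procs, n_threads


def as_flags(n_procs, n_threads):
    """Format the layout using the -np and -nt options named in this section."""
    return f"-np {n_procs} -nt {n_threads}"


if __name__ == "__main__":
    procs, threads = split_cores(16, 64, 20)
    print(as_flags(procs, threads))
```

With 16 cores, 64 GB of memory, and an estimated 20 GB per DDM process, memory limits the run to three DDM processes with five threads each, leaving one core unused.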
Nonlinear Static Analysis | Hybrid (SMP + DDM)
  • You can use as many DDM processes (-np) as allowed by available memory, and the remainder of the cores can be distributed as SMP (-nt).
  • If you expect that solving multiple nonlinear subcases in parallel will provide more speedup than solving them sequentially, Multi-level DDM can be turned on manually via PARAM,DDMNGRPS,# (refer to Domain Decomposition Method for certain exceptions).
Nonlinear Transient Analysis | Hybrid (SMP + DDM)
  • You can use as many DDM processes (-np) as allowed by available memory, and the remainder of the cores can be distributed as SMP (-nt).
  • Multi-level DDM is not supported. Therefore, subcase-based parallelization is not supported.
Normal Modes Analysis using Lanczos (Eigenvalue Analysis) | Hybrid (DDM + SMP)
  • Geometric partitioning via DDM is supported for Lanczos. DDM can be used via -np with as many cores as available memory allows, and the remaining cores can be distributed via SMP (-nt).
Normal Modes Analysis using AMSES (Eigenvalue Analysis) | Single Modal Space - SMP; Multiple Modal Spaces - Hybrid (DDM + SMP)

  • Geometric-partitioning via DDM is not supported for AMSES.
  • If there is only a single modal space, SMP can be used via -nt.
  • If there are multiple modal spaces, BC-based parallelization is supported for AMSES and a hybrid SMP + DDM run can be used. Set -np equal to the number of modal spaces and distribute the remaining cores via SMP (-nt).
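
The AMSES rule above (SMP only for a single modal space, one DDM process per modal space otherwise) can be sketched as a small helper. The function name is hypothetical, and capping -np at the core count and treating -nt as threads per process are assumptions added for the illustration.

```python
def amses_flags(total_cores, n_modal_spaces):
    """Return a hedged -np/-nt suggestion for an AMSES normal modes run.

    Hypothetical helper, not part of OptiStruct.
    """
    if total_cores < 1 or n_modal_spaces < 1:
        raise ValueError("total_cores and n_modal_spaces must be at least 1")

    # Single modal space: geometric partitioning is unavailable, so use SMP only.
    if n_modal_spaces == 1:
        return f"-nt {total_cores}"

    # Multiple modal spaces: one DDM process per modal space
    # (assumption: never request more processes than cores).
    n_procs = min(n_modal_spaces, total_cores)

    # Remaining cores become SMP threads per process (assumption).
    n_threads = max(1, total_cores // n_procs)
    return f"-np {n_procs} -nt {n_threads}"


if __name__ == "__main__":
    print(amses_flags(16, 1))
    print(amses_flags(16, 4))
```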
Normal Modes Analysis using AMLS (Eigenvalue Analysis) | SMP
  • Neither geometric partitioning via DDM nor BC-based parallelization is supported for AMLS. Therefore, only SMP is recommended.
Linear Modal Frequency Response Analysis | Hybrid (DDM + SMP)

This entry covers only the FRF solution part. For eigenvalue extraction, see the Normal Modes Analysis entries above.
  • Modal Frequency Response parallelization depends on two main parts:
    • Eigenvalue extraction
    • FRF solution
  • Performance depends on which part dominates the runtime. If the number of modes is high and the number of loading frequencies is low, eigenvalue extraction can dominate; otherwise, the FRF solution dominates.
  • Eigenvalue extraction has already been covered above.
  • For Modal FRF solution using regular factorization, BCs are solved sequentially and loading frequencies are parallelized via DDM. Such frequency partitioning via DDM can speed up the solution.
  • For Modal FRF solution using FASTFR, DDM may not be useful unless faster eigenvalue extraction under DDM compensates for it.
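
To illustrate what frequency partitioning means for the regular factorization case, the sketch below splits a list of loading frequencies into near-equal contiguous chunks, one per DDM process. How OptiStruct actually assigns frequencies to processes is not stated here, so the contiguous scheme and the function name are assumptions made only for this example.

```python
def partition_frequencies(frequencies, n_processes):
    """Split loading frequencies into n_processes near-equal contiguous chunks.

    Hypothetical illustration of frequency partitioning; the real solver
    may distribute frequencies differently.
    """
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")

    # The first `extra` processes receive one additional frequency each.
    base, extra = divmod(len(frequencies), n_processes)

    chunks = []
    start = 0
    for rank in range(n_processes):
        size = base + (1 if rank < extra else 0)
        chunks.append(list(frequencies[start:start + size]))
        start += size
    return chunks


if __name__ == "__main__":
    for rank, chunk in enumerate(partition_frequencies([10, 20, 30, 40, 50], 2)):
        print(f"process {rank}: {chunk}")
```

If there are fewer loading frequencies than processes, some chunks are empty, which mirrors why adding DDM processes stops helping once the frequency count is exhausted.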
Linear Modal Transient Response Analysis | Hybrid (DDM + SMP)

This entry covers only the Transient Energy Equation solution part. For eigenvalue extraction, see the Normal Modes Analysis entries above.
  • Modal Transient Response parallelization depends on two main parts:
    • Eigenvalue extraction
    • Transient Response solution
  • Performance depends on which part dominates the runtime. If the number of modes is high and the transient response solution is quick, eigenvalue extraction can dominate; otherwise, the transient solution dominates.
  • Eigenvalue extraction has already been covered above.
  • For Modal Transient Response solution, BCs are parallelized via Multi-level DDM. However, DDM level-2 geometric partitioning is not supported for the transient solution within each MPI group, so some MPI processes within each group may remain idle.
Linear Direct Transient Analysis | Hybrid (SMP + DDM)
  • For small models, SMP may outperform DDM since BCs are not parallelized for Linear Direct Transient.
  • For larger models, DDM may provide more speedup since geometric-partitioning is supported.
  • The Newmark-Beta time integration method may improve performance in certain cases.
Linear Direct Frequency Response Analysis | Hybrid (SMP + DDM)
  • For small models, SMP may outperform DDM since BCs are not parallelized for Linear Direct Frequency Response.
  • For larger models, DDM may provide more speedup since frequency splitting is supported: loading frequencies are distributed among different MPI process groups, and within each group geometric partitioning is performed using MUMPS.
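
The two-level layout described above (MPI process groups for frequency splitting, with geometric partitioning inside each group) can be pictured with the sketch below, which only arranges MPI ranks into groups. The function name and the contiguous, near-equal grouping are assumptions for illustration; the actual grouping is decided by the solver.

```python
def group_ranks(n_processes, n_groups):
    """Arrange MPI ranks 0..n_processes-1 into n_groups contiguous groups.

    Hypothetical illustration: each group would handle a subset of the
    loading frequencies, and the ranks inside a group would share the
    geometric partitions of the model.
    """
    if n_groups < 1 or n_groups > n_processes:
        raise ValueError("n_groups must be between 1 and n_processes")

    # The first `extra` groups receive one additional rank each.
    base, extra = divmod(n_processes, n_groups)

    groups = []
    next_rank = 0
    for group_index in range(n_groups):
        size = base + (1 if group_index < extra else 0)
        groups.append(list(range(next_rank, next_rank + size)))
        next_rank += size
    return groups


if __name__ == "__main__":
    for index, ranks in enumerate(group_ranks(8, 2)):
        print(f"group {index}: ranks {ranks}")
```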