VaspMultiStageWorkChain

The VaspMultiStageWorkChain is inspired by the Cp2kMultiStageWorkChain of the aiida-lsmo package. In a nutshell, DFT codes are capable of performing different types of calculations. For instance, we can relax atomic positions, lattice parameters, and/or the unit cell shape, and then use the resulting structure to perform more accurate electronic structure calculations. The aim of the multistage approach is to bring all the calculations we want to perform under the same umbrella. This gives an automated, robust, and reproducible pipeline of calculations, which need not all use the same DFT code or the same calculation setup. This important feature is enabled by providing the combinations/settings in the form of protocols.

Protocols and how they work

The VaspMultiStageWorkChain allows us to combine any sequence of VASP calculations. This sequence is provided as a YAML file. Although I provide a series of different protocols, these can easily be supplied by the user too. The protocols shipped with aiida-catmat are available in workchains/protocols/vasp. In protocol names, R and S stand for Relaxation and Static calculations. Both 03 and 3 mean using ISIF=3 in the relaxation; however, 03 only runs for 5 steps. The sequence of letters and numbers gives the sequence of stages. For instance, R03R3S is a protocol that runs a short 5-step relaxation (R03), then inspects the result and modifies the INCAR if necessary, then runs a full relaxation (R3), and finally runs a static calculation (S) with slightly increased ENCUT and decreased KSPACING to obtain more accurate energies. The protocol file looks like:

stage_0:
    ALGO: Normal
    EDIFF: 1.0e-06
    ENCUT: 650
    IBRION: 2
    ISIF: 3
    ISMEAR: 0
    ISPIN: 2
    LORBIT: 11
    LREAL: Auto
    LWAVE: true
    NELM: 200
    NSW: 5
    PREC: Accurate
    SIGMA: 0.05
stage_1:
    ALGO: Normal
    EDIFF: 1.0e-06
    EDIFFG: -0.01
    ENCUT: 650
    IBRION: 2
    ISIF: 3
    ISMEAR: 0
    ISPIN: 2
    LORBIT: 11
    LREAL: Auto
    LWAVE: true
    NELM: 200
    NSW: 400
    PREC: Accurate
    SIGMA: 0.05
stage_2:
    ALGO: Normal
    EDIFF: 1.0e-07
    ENCUT: 700
    IBRION: -1
    ISIF: 3
    ISMEAR: 0
    ISPIN: 2
    LAECHG: true
    LCHARG: true
    LORBIT: 11
    LREAL: false
    LVHAR: true
    LWAVE: false
    NELM: 200
    NSW: 0
    PREC: Accurate
    SIGMA: 0.05
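
The naming convention described above can be decoded mechanically. The following is a minimal sketch, not part of aiida-catmat: the helper decode_protocol_name is hypothetical, and it assumes that every relaxation token is R followed by an optional 0 (short run) and a single non-zero ISIF digit, and that S marks a static stage.

```python
import re

# Hypothetical helper (not part of aiida-catmat). It only encodes the naming
# convention described in the text: "R" + optional "0" + ISIF digit, or "S".
_TOKEN = re.compile(r"R(?P<short>0?)(?P<isif>[1-9])|(?P<static>S)")


def decode_protocol_name(name):
    """Split a protocol name such as 'R03R3S' into a list of stage descriptions."""
    stages = []
    pos = 0
    while pos < len(name):
        match = _TOKEN.match(name, pos)
        if match is None:
            raise ValueError(f"Unrecognized token at position {pos} in {name!r}")
        if match.group("static"):
            stages.append({"kind": "static"})
        else:
            stages.append({
                "kind": "relaxation",
                "isif": int(match.group("isif")),
                # A leading zero marks the short (5-step) relaxation.
                "short": match.group("short") == "0",
            })
        pos = match.end()
    return stages


if __name__ == "__main__":
    for index, stage in enumerate(decode_protocol_name("R03R3S")):
        print(f"stage_{index}: {stage}")
```

For R03R3S this yields three entries, matching stage_0, stage_1, and stage_2 of the protocol file.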

Currently, users can supply their own protocol by modifying the settings in one of the existing protocols. To provide the user settings, one can define a dictionary and pass it as parameters:

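from aiida.orm import Dict
from aiida_catmat.workchains import VaspMultiStageWorkChain

# INCAR tags that are added to, or override, those of the loaded protocol
user_incar = {
    'NPAR': 1,
    'GGA': 'PS',
    'ISPIN': 2,
    'ENCUT': 500,
    'LDAU': False,
}
builder = VaspMultiStageWorkChain.get_builder()
builder.parameters = Dict(dict=user_incar)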

This will either add extra tags to the loaded protocol or overwrite the existing ones.
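
The add-or-overwrite behaviour can be pictured with plain dictionaries. This is only a sketch: apply_user_incar is a hypothetical helper, and applying the user tags to every stage is an assumption made for illustration; the workchain's actual merging logic may differ.

```python
import copy


def apply_user_incar(protocol, user_incar):
    """Return a copy of the protocol with the user tags added or overwritten.

    Hypothetical helper. ASSUMPTION: the user tags are applied to every
    stage of the protocol; the real workchain may treat stages differently.
    """
    merged = copy.deepcopy(protocol)
    for stage_tags in merged.values():
        # dict.update adds new tags and overwrites tags that already exist.
        stage_tags.update(user_incar)
    return merged


if __name__ == "__main__":
    protocol = {
        "stage_0": {"ENCUT": 650, "ISPIN": 2, "NSW": 5},
        "stage_1": {"ENCUT": 650, "ISPIN": 2, "NSW": 400},
    }
    user_incar = {"NPAR": 1, "ENCUT": 500}
    for label, tags in apply_user_incar(protocol, user_incar).items():
        print(label, tags)
```

Here ENCUT is overwritten, NPAR is added, and tags such as NSW that the user did not touch keep their protocol values.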

Kpoints

There are two ways to supply kpoints to the workchain. We can use the kpoints input:

from aiida.orm import KpointsData

kpoints = KpointsData()
kpoints.set_kpoints_mesh([1, 1, 1], offset=[0, 0, 0])
builder.vasp_base.vasp.kpoints = kpoints

or we can provide the kspacing tag, in which case we can also make the generated mesh gamma-centered or force parity in it:

from aiida.orm import Bool, Float

builder.kspacing = Float(0.242)
builder.kgamma = Bool(True)
builder.force_parity = Bool(True)
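
To get a feeling for the mesh that a given kspacing produces, the sketch below applies the rule documented for the VASP KSPACING tag, N_i = max(1, ceil(|b_i| / kspacing)), with reciprocal lattice vectors b_i that include the factor 2π. The function mesh_from_kspacing is hypothetical, and rounding odd values up to the next even number is only an assumption about what force_parity does; the mesh generated by the workchain itself may differ.

```python
import math


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mesh_from_kspacing(cell, kspacing, force_parity=False):
    """Estimate a k-point mesh from a 3x3 cell (in Å) and a spacing (in 1/Å).

    Hypothetical helper. ASSUMPTION: force_parity rounds each odd
    subdivision up to the next even number.
    """
    a1, a2, a3 = cell
    volume = abs(_dot(a1, _cross(a2, a3)))
    mesh = []
    for numerator in (_cross(a2, a3), _cross(a3, a1), _cross(a1, a2)):
        # Length of the reciprocal lattice vector, including the 2*pi factor.
        length = 2 * math.pi * math.sqrt(_dot(numerator, numerator)) / volume
        subdivisions = max(1, math.ceil(length / kspacing))
        if force_parity:
            subdivisions += subdivisions % 2
        mesh.append(subdivisions)
    return mesh


if __name__ == "__main__":
    cubic = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
    print(mesh_from_kspacing(cubic, 0.242))
    print(mesh_from_kspacing(cubic, 0.242, force_parity=True))
```

For a cubic cell with a 4 Å edge, |b_i| = 2π/4 ≈ 1.571 Å⁻¹, so a spacing of 0.242 gives ceil(6.49) = 7 subdivisions along each direction, or 8 with the assumed parity rule.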

POTCAR sets

Currently, there are two sets of POTCAR mappings available:

  1. MPRelaxSet: the mappings taken from the Materials Project relax set.

  2. VASP: the mappings recommended by VASP.

Hubbard parameters

As with the POTCAR mappings, there are two sets of U parameters available, both extracted from Materials Project datasets:

  1. MITSet

  2. MPSet

Output dictionary

Once the calculation is finished, we will have a dictionary containing a large amount of information parsed from vasprun.xml and OUTCAR for each stage of the calculation. It looks like:

"stage_0_static": {
    "DFT+U": false,
    "band_gap_spin_down": 0.0,
    "band_gap_spin_up": 0.0,
    "band_gap_unit": "eV",
    "converged": true,
    "converged_electronically": true,
    "converged_ionically": true,
    "converged_magmoms": [
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0
    ],
    "energy_unit": "eV",
    "errors": {},
    "extra_parameters": {
        "amin": 0.1,
        "amix": 0.4,
        "amix_mag": 1.6,
        "bmix": 1.0,
        "bmix_mag": 1.0,
        "ebreak": 1.32e-06,
        "imix": 4,
        "ngx": 36,
        "ngxf": 72,
        "ngy": 36,
        "ngyf": 72,
        "ngz": 36,
        "ngzf": 72,
        "number_of_bands": 19,
        "number_of_electrons": 24.0
    },
    "fermi_energy": 0.49552331,
    "final_energy": -15.67042079,
    "final_energy_per_atom": -1.95880259875,
    "potcar_specs": [
        {
            "hash": null,
            "titel": "PAW_PBE Li_sv 10Sep2004"
        }
    ],
    "run_type": "PBEsol",
    "spin_polarized": true,
    "total_magnetization": -0.3907086
}

Moreover, if a relaxation is involved, the relaxed structure is returned as an output of the workchain. The final INCAR for each stage of the workchain is also reported:

Outputs                 PK    Type
----------------------  ----  -------------
final_incar
    stage_0_static      792   Dict
    stage_1_relaxation  802   Dict
    stage_2_static      813   Dict
output_parameters       821   Dict
structure               809   StructureData
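
The per-stage dictionary lends itself to quick sanity checks. The sketch below assumes that the content of output_parameters has already been converted to a plain Python dict keyed by stage labels such as stage_0_static, with the converged and final_energy_per_atom keys shown in the excerpt above; the helper summarize_stages is hypothetical.

```python
def summarize_stages(output_parameters):
    """Return (label, converged, energy per atom) for each stage.

    Hypothetical helper. ASSUMPTION: `output_parameters` is a plain dict
    keyed by stage label, each value holding the parsed results of a stage.
    """
    summary = []
    for label, info in output_parameters.items():
        summary.append((
            label,
            # A missing "converged" key is treated as not converged.
            bool(info.get("converged", False)),
            info.get("final_energy_per_atom"),
        ))
    return summary


if __name__ == "__main__":
    example = {
        "stage_0_static": {
            "converged": True,
            "final_energy_per_atom": -1.95880259875,
            "energy_unit": "eV",
        },
    }
    for label, converged, energy in summarize_stages(example):
        print(f"{label}: converged={converged}, energy per atom={energy}")
```

A stage whose entry lacks the converged key is reported as not converged, so incomplete results are not mistaken for successful ones.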

Detailed inputs, outputs, and outline

workchain aiida_catmat.workchains.VaspMultiStageWorkChain

The ``VaspMultiStageWorkChain``

Inputs:

  • force_parity, Bool, optional – set to True to force parity in generated kpoint mesh
  • hubbard_tag, Str, optional – The string that controls which set of U parameters the user wants to use
  • kgamma, Bool, optional – gamma centered kpoints in kspacing case
  • kspacing, Float, optional – The kspacing tag to generate kpoints mesh
  • magmom, List, optional – List of user supplied MAGMOM tag
  • max_stage_iteration, Int, optional – Maximum number of iterations/trials in case of failure.
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • parameters, Dict, required – The input parameters.
  • potcar_set, Str, optional – Select which POTCAR set should be used to construct the mapping: VASP or MPRelaxSet
  • potential_family, Str, required – The string which defines which potential (POTCAR) family we want to use
  • potential_mapping, Dict, optional – The dictionary which controls which specific POTCAR the user wants to use for each atom type.
  • protocol_tag, Str, optional – The string which controls which protocol to use for setting up the calculations.
  • restart_folder, RemoteData, optional – Remote folder with data to use for restarting a calculation
  • settings, Dict, optional
  • structure, (StructureData, CifData), required – The input structure to perform calculations on
  • vasp_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • vasp, Namespace
      Namespace Ports
      • code, Code, required – The Code to use for this job.
      • kpoints, KpointsData, optional – kpoints mesh
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use for the queue on the remote computer
          • additional_retrieve_list, (list, tuple), optional, non_db – List of relative file paths that should be retrieved in addition to what the plugin specifies.
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
          • parser_name, str, optional, non_db – Parser of the calculation: the default is vasp_base_parser to get the necessary info
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • stash, Namespace – Optional directives to stash files after the calculation job has completed.
            Namespace Ports
            • source_list, (tuple, list), optional, non_db – Sequence of relative filepaths representing files in the remote directory that should be stashed.
            • stash_mode, str, optional, non_db – Mode with which to perform the stashing, should be a value of aiida.common.datastructures.StashMode.
            • target_base, str, optional, non_db – The base location to where the files should be stashed. For example, for the copy stash mode, this should be an absolute filepath on the remote computer.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db – Set the calculation to use mpi
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • potential, Namespace – The potentials (POTCAR).
      • restart_folder, RemoteData, optional – A remote folder to restart from if need be

Outputs:

  • final_incar, Namespace
  • magmom, List, optional – List of MAGMOM
  • output_parameters, Dict, required
  • structure, StructureData, optional

Outline:

initialize(Initialize inputs and settings)
while(should_run_next_stage)
    run_stage(Prepares and submits static calculations as long as they are needed.)
    inspect_stage(Do the inspection of finished stage!)
results(Attach the remaining output results.)