DFTB+ recipes

Introduction

This document is a collection of examples demonstrating the usage of the atomistic quantum mechanical software DFTB+.

Before you start

The examples assume that you have the latest stable version of DFTB+ installed (although many of the recipes may also work with older versions of the code). Additionally the examples require some parameterisation files (Slater-Koster files), which can be downloaded from dftb.org.

The recipes in this document often only show the relevant parts of the input. In order to obtain the full input files and in order to run the examples yourself, please download the archive containing all the inputs of the individual recipes.

This can be downloaded from the online version of the DFTB+ recipes at https://dftbplus-recipes.readthedocs.io/.

In each recipe we indicate where to find the corresponding directories in the archive with square brackets after the section title (e.g. [Input: recipes/basics/firstcalc/]).

Installing from conda-forge

DFTB+ is available with the cross platform package manager conda on the conda-forge channel. If you have no conda installation yet, we recommend to bootstrap an installation with the conda-forge distribution miniforge or the anaconda distribution miniconda.

Installing dftb+ from the conda-forge channel can be achieved by adding conda-forge to your channels with:

conda config --add channels conda-forge

Once the conda-forge channel has been enabled, dftb+ can be installed with:

conda install dftbplus

If you prefer to install an MPI parallel version you have to explicitly request it with

conda install 'dftbplus=*=mpi_*'

Additional components, like the dptools and the Python API, are available as separate packages on the same channel. You can install them with

conda install dftbplus-tools dftbplus-python

It is possible to list all of the versions of dftb+ and its additional components available on your platform with:

conda search 'dftbplus*' --channel conda-forge

Getting Slater-Koster data

After unpacking the archive, you must also download all Slater-Koster-files needed by the examples, by using the supplied script as

./scripts/get_slakos

in the root folder of the archive. You should then be able to run each example by calling dftb+ from the corresponding input folder (assuming that dftb+ is installed in your executable path). For some examples, there is also a supplied script file in the directory to run examples of multistage calculations, called run.sh which contains the individual commands needed to run the full example.

Where to start

The individual chapters are more or less independent from each other, so you may go directly to the one relevant to your interests. However, if you are new to DFTB+, please make sure to work through the relevant introductory examples in the Basic usage chapters first.

The recipes are to introduce you to specific functionalities of DFTB+ and so are, therefore rather short and focused. Please also always consult the corresponding sections of the DFTB+ manual for further details and possibilities.

Please note that the example outputs in the recipes may have been created with older versions of DFTB+ and therefore could differ slightly in format from output of the most recent code. The corresponding inputs in the archive should work, without any changes, with the last stable release of DFTB+.

Basic usage

This part contains recipes for some of the basic tasks often performed with DFTB+. This is the best place to start if you are new to this software package.

First calculation with DFTB+

[Input: recipes/basics/firstcalc/]

This chapter should serve as a tutorial guiding you through your first calculation with DFTB+. As an example, the equilibrium geometry of the water molecule will be calculated. As with all simulation tools, the process consists of three steps:

  • telling DFTB+ what to do,
  • running DFTB+,
  • analysing the results.

Providing the input

DFTB+ accepts the input in the Human-readable Structured Data (HSD) format. The input file must be called dftb_in.hsd. The input file used in this example looks as follows:

Geometry = GenFormat {
3 C
  O H

  1 1  0.00000000000E+00 -0.10000000000E+01  0.00000000000E+00
  2 2  0.00000000000E+00  0.00000000000E+00  0.78306400000E+00
  3 2  0.00000000000E+00  0.00000000000E+00 -0.78306400000E+00
}

Driver = ConjugateGradient {
  MovedAtoms = 1:-1
  MaxForceComponent = 1E-4
  MaxSteps = 100
  OutputPrefix = "geom.out"
}

Hamiltonian = DFTB {
  Scc = Yes
  SlaterKosterFiles {
    O-O = "../../slakos/mio-ext/O-O.skf"
    O-H = "../../slakos/mio-ext/O-H.skf"
    H-O = "../../slakos/mio-ext/H-O.skf"
    H-H = "../../slakos/mio-ext/H-H.skf"
  }
  MaxAngularMomentum {
    O = "p"
    H = "s"
  }
}

Options {}

Analysis {
  CalculateForces = Yes
}

ParserOptions {
  ParserVersion = 7
}

The order of the specified blocks in the HSD input is arbitrary. You are free to capitalise the keywords and any physical units as you like, since they are case-insensitive. This is not valid however for string values, especially if they are specifying file names.

Geometry

The Geometry block contains types and coordinates of the atoms in your system. The geometry of the system in the sample input file is provided in the so called “gen” format, which was the traditional geometry input format of the DFTB method. The formal description of this format can be found in the DFTB+ manual. In the current example, the geometry is

Geometry = GenFormat {
  3  C                   # 3 atoms, non-periodic cluster
   O H                   # Two elements, 1 - O, 2 - H
  #  Index Type  Coordinates
       1    1    0.00000000000E+00  -0.10000000000E+01   0.00000000000E+00
       2    2    0.00000000000E+00   0.00000000000E+00   0.78306400000E+00
       3    2    0.00000000000E+00   0.00000000000E+00  -0.78306400000E+00
}

which specifies a cluster of 3 atoms of chemical types O and H (cluster geometries having open boundary conditions as distinct from periodic supercell geometries). The Cartesian coordinates of the atoms in the “gen” format are given in Angstroms. The first column of integers contains the sequential numbering of the atoms in the system (the actual values are ignored by the parser). The second column contains the type of each atom, given as the position of the appropriate element in the element list of the second line of the “gen” data. The GenFormat{} is not the only way to specify the geometry, you should check the manual for other methods.

As demonstrated above, it is possible to put arbitrary comments in the HSD input after a hash-mark (#) character. Everything between this character and the end of the current line is ignored by the parser.

Very often, the geometry is stored in an external file. To save you the copying and pasting from that file into the dftb_in.hsd file, you can use the file inclusion feature of the HSD format:

Geometry = GenFormat {
  <<< "geometry.gen"
}

The <<< operator includes the specified file as raw text data. (The file is not checked for any HSD constructs.) In the example above, the file geometry.gen must be in gen format.

Driver

After having specified the geometry of your system, you should decide what DFTB+ will do with that geometry. The Driver environment determines how the geometry should be changed (if at all) during the calculation. If you only would like to make a static calculation, you must either set it to an empty value like

Driver {}   # Empty value for the driver

or omit the Driver block completely from dftb_in.hsd.

In the current example:

# Do conjugate gradient optimisation
Driver = ConjugateGradient {
  MovedAtoms = 1:-1               # Move all atoms in the system
  MaxForceComponent = 1.0e-4      # Stop if maximal force below 1.0e-4
  MaxSteps = 100                  # Stop after maximal 100 steps
  OutputPrefix = "geom.out"       # Final geometry in geom.out.{xyz,gen}
}

the molecule is relaxed using the conjugate gradient method. The entire range of atoms from the first (atom 1) until and including the last (-1) is allowed to move. Instead of 1:-1 you could also have written:

MovedAtoms = 1:3               # Atoms from the 1st until the 3rd

or

MovedAtoms = O H               # Select O and H atoms.

or

MovedAtoms = 1 2 3              # Explicitely listing all atom numbers.

In our case the geometry optimisation continues as long as the maximum component of the force acting on the moving atoms is bigger than 1e-4 atomic units (Hartree per Bohr radius). Numeric values are by default interpreted to be in atomic units. However the HSD format offers the possibility of using alternative units by specifying a unit modifier before the equals sign. This is given in square brackets. For example instead of the original atomic units, you could have used

MaxForceComponent [eV/AA] = 5.14e-3    # Force in Electronvolts/Angstrom

or

MaxForceComponent [Electronvolt/Angstrom] = 5.14e-3

See the manual for the list of accepted modifiers.

The MaxSteps keyword specifies the maximum number of geometry optimisation steps that the program can take before stopping, even if the specified tolerance for the maximal force component have not been achieved by that stage of the calculation.

Finally, the OutputPrefix keyword specifies the name of the file to be written that will contain the present geometry during the optimisation (and then the final geometry at the end of the calculation). The geometry is written in gen and xyz formats to the files obtained by appending “.gen” and “.xyz” suffixes to the specified name (geom.out.gen and geom.out.xyz in our case.) The dptools package distributed with DFTB+ contains scripts (gen2xyz and xyz2gen) to convert between the gen and the xyz formats (and various other formats).

Hamiltonian

You have to decide upon the model used to describe your system in order to calculate its properties. At the moment DFTB+ simplifies this decision quite a lot, since it currently only supports types of Density Functional based Tight Binding Hamiltonians (with some extensions). In our example, the chosen self-consistent DFTB Hamiltonian has the following properties:

Hamiltonian = DFTB {                 # DFTB Hamiltonian
  Scc = Yes                          # Use self consistent charges
  SlaterKosterFiles {                # Specifying Slater-Koster files
    O-O = "../../slakos/mio-ext/O-O.skf"
    O-H = "../../slakos/mio-ext/O-H.skf"
    H-O = "../../slakos/mio-ext/H-O.skf"
    H-H = "../../slakos/mio-ext/H-H.skf"
  }
  MaxAngularMomentum {               # Maximal l-value of the various species
    O = "p"
    H = "s"
  }
}

In this example the charge self-consistent DFTB (SCC-DFTB) method is used for the electronic structure (and calculating the total energy, forces, etc.). This method includes the effect of charge transfer between atoms of the system. In order to find the final ground state of the system it has to iteratively solve the system, until the atomic charges are self-consistently converged. Convergence is reached if the difference between the charges used to build the Hamiltonian and the charges obtained after the diagonalisation of the Hamiltonian is below a certain tolerance (the default is 1e-5 electrons, but can be tuned with the SccTolerance option). If this level of convergence is not reached within a certain number of iterations, the code calculates the total energy using the charges obtained so far and stops with an appropriate warning message. The maximal number of scc-iterations is by default 100, but can be changed via the MaxSccIterations option.

The tabulated integrals (together with other atomic and diatomic parameters) necessary for building the DFTB Hamiltonian are stored in the so called Slater-Koster files. Those files always describe the interaction between atom pairs. Therefore, you have to specify, for each pairwise combination of chemical elements in your system, the corresponding Slater-Koster file:

SlaterKosterFiles {               # Specifying Slater-Koster files
  O-O = "../../slakos/mio-ext/O-O.skf"
  O-H = "../../slakos/mio-ext/O-H.skf"
  H-O = "../../slakos/mio-ext/H-O.skf"
  H-H = "../../slakos/mio-ext/H-H.skf"
}

If you use a consistent file naming convention, you can avoid typing all the file names by specifying only the generating pattern. The input:

SlaterKosterFiles = Type2FileNames {    # File names with two atom type names
  Prefix = "../../slakos/mio-ext/"    # Prefix before first type name
  Separator = "-"                     # Dash between type names
  Suffix = ".skf"                     # Suffix after second type name
}

would generate exactly the same file names as in the example above.

Historically the Slater-Koster file format did not contain any information about which valence orbitals were considered when generating the interaction tables, this can lead to data for physically inappropriate orbitals being included in the files. Therefore, you must provide the value of the highest orbital angular momentum each element, specified as s, p, d or f. This information can be obtained from the documentation of the Slater-Koster files. In the distributed standardised sets (available at http://www.dftb.org) this information is contained in the documentation appended to the end of each SK-file.

The default behaviour of the code is to assume that your system is neutral (net electrical charge of 0). If you would like to calculate charged systems, you have to use the Charge option. Similarly, the system is assumed to be spin-unpolarised. You can however use the option SpinPolarisation to change this standard behaviour.

Analysis

The Analysis block contains options to calculate (or display if otherwise only calculated internally) a number of properties. In this example, while forces are needed to optimise the geometry, these are not usually printed in full, only the maximum value. The CalculateForces option enables printing of the forces.

Options

The Options block contains a few global settings for the code. In the current example, no options are specified. You could even leave out the:

Options {}

line in the input, since the default value for the Options block is an empty block.

ParserOptions

This block contains options which are interpreted by the parser itself and are not passed to the main program. The most important of those options is the ParserVersion option, which tells the parser, for which version of the parser the current input file was created for. If this is not the current parser but an older one, the parser internally automatically converts the old input to the new format.

The version number of the parser in the current DFTB+ code is always printed out at the program start. It is a good habit to set this value in your input files explicitly, like in our case:

ParserVersion = 7

This allows you to use your input file with future versions of DFTB+ without adapting it by hand, if the input format has changed in the more recent version.

Running DFTB+

After creating the main input file, you should make sure that all the other required files (Slater-Koster files, any files included in the HSD input via <<< constructs, etc.) are at the right place. In our example, only the Slater-Koster files need to be present.

In order to run the calculation, you should invoke DFTB+ without any arguments in the directory containing the file dftb_in.hsd. As DFTB+ writes some useful output to the standard output (to the screen), it is recommended to save this output for later investigation:

dftb+ | tee output

Assuming the binary dftb+ is in your search path, you should obtain an output starting with:

|===============================================================================
|
|  DFTB+ (Release 17.1)
|
|  Copyright (C) 2017  DFTB+ developers group
|
|===============================================================================
|
|  When publishing results obtained with DFTB+, please cite the following
|  reference:
|
|  * B. Aradi, B. Hourahine and T. Frauenheim,
|    DFTB+, a Sparse Matrix-Based Implementation of the DFTB Method,
|    J. Phys. Chem. A, 111 5678 (2007).  [doi: 10.1021/jp070186p]
|
|  You should also cite additional publications crediting the parametrization
|  data you use. Please consult the documentation of the SK-files for the
|  references.
|
|===============================================================================


***  Parsing and initializing

Parser version: 5

Interpreting input file 'dftb_in.hsd'
--------------------------------------------------------------------------------
Reading SK-files:
  O-O.skf
  O-H.skf
  O-H.skf
  H-H.skf
Done.


Processed input in HSD format written to 'dftb_pin.hsd'

Starting initialization...
--------------------------------------------------------------------------------
Mode:                        Conjugate gradient relaxation
Self consistent charges:     Yes
SCC-tolerance:                 0.100000E-04
Max. scc iterations:                    100
Ewald alpha parameter:         0.000000E+00
Spin polarisation:           No
Nr. of up electrons:             4.000000
Nr. of down electrons:           4.000000
Periodic boundaries:         No
Diagonalizer:                Relatively robust (version 1)
Mixer:                       Broyden mixer
Mixing parameter:                  0.200000
Maximal SCC-cycles:                     100
Nr. of chrg. vec. in memory:              0
Nr. of moved atoms:                       3
Max. nr. of geometry steps:             100
Force tolerance:               0.100000E-03
Force evaluation method:     Traditional
Electronic temperature:        0.100000E-07
Initial charges:             Set automatically (system chrg:   0.000E+00)
Included shells:             O:  s, p
                             H:  s
Extra options:
                             Mulliken analysis
Force type                   original


--------------------------------------------------------------------------------

***  Geometry step: 0

    iSCC Total electronic   Diff electronic      SCC error
    1   -0.39511797E+01    0.00000000E+00    0.88081627E+00
    2   -0.39705438E+01   -0.19364070E-01    0.55742893E+00
    3   -0.39841371E+01   -0.13593374E-01    0.32497352E-01
    4   -0.39841854E+01   -0.48242063E-04    0.19288772E-02
    5   -0.39841856E+01   -0.17020682E-06    0.87062163E-05

 Total Energy:                      -3.9798793068 H         -108.2980 eV
 Total Mermin free energy:          -3.9798793068 H         -108.2980 eV
 Maximal force component:            0.187090E+00
>> Charges saved for restart in charges.bin

--------------------------------------------------------------------------------

***  Geometry step: 1

  iSCC Total electronic   Diff electronic      SCC error
    1   -0.40495559E+01    0.00000000E+00    0.92334735E-01
.
.
.

If this is the case, you have managed to run DFTB+ for the first time. Congratulations!

Examining the output

DFTB+ communicates through two channels with you: by printing information to standard output (which you should redirect into a file to keep for later evaluation) and by writing information into various files. In the following, the most important of these files will be introduced and analysed

Standard output

The first thing appearing in standard output after the start of DFTB+ is the program header:

|===============================================================================
|
|  DFTB+ (Release 17.1)
|
|  Copyright (C) 2017  DFTB+ developers group
|
|===============================================================================
|
|  When publishing results obtained with DFTB+, please cite the following
|  reference:
|
|  * B. Aradi, B. Hourahine and T. Frauenheim,
|    DFTB+, a Sparse Matrix-Based Implementation of the DFTB Method,
|    J. Phys. Chem. A, 111 5678 (2007).  [doi: 10.1021/jp070186p]
|
|  You should also cite additional publications crediting the parametrization
|  data you use. Please consult the documentation of the SK-files for the
|  references.
|
|===============================================================================


***  Parsing and initializing

Parser version: 5

This tells you which program you are using (DFTB+), which release (17.1) and the paper(s) associated with the code. Then the version of the parser used in this DFTB+ release is listed.

As already discussed above, it can be a good habit to set this version number explicitly in your input inside the ParserOptions block, so that:

ParserOptions {
  ParserVersion = 7
}

Next, the parser starts to interpret your input, then reads in the necessary SK-files and writes the full input settings to dftb_pin.hsd:

Interpreting input file 'dftb_in.hsd'
--------------------------------------------------------------------------------
Reading SK-files:
  O-O.skf
  O-H.skf
  O-H.skf
  H-H.skf
Done.


Processed input in HSD format written to 'dftb_pin.hsd'

You do not have to explicitly set all the possible options for DFTB+ in the input, as for most of them there are default values set by the parser if not set in the input. If you want to know which default values have been set for those missing specifications, you should look at the processed input file dftb_pin.hsd, which contains the value for all the possible input settings (see next the subsection).

At this point the DFTB+ code is then initialised, and the most important parameters of the calculation are then printed out:

Mode:                        Conjugate gradient relaxation
Self consistent charges:     Yes
SCC-tolerance:                 0.100000E-04
Max. scc iterations:                    100
Ewald alpha parameter:         0.000000E+00
Spin polarisation:           No
Nr. of up electrons:             4.000000
Nr. of down electrons:           4.000000
Periodic boundaries:         No
Diagonalizer:                Relatively robust (version 1)
Mixer:                       Broyden mixer
Mixing parameter:                  0.200000
Maximal SCC-cycles:                     100
Nr. of chrg. vec. in memory:              0
Nr. of moved atoms:                       3
Max. nr. of geometry steps:             100
Force tolerance:               0.100000E-03
Force evaluation method:     Traditional
Electronic temperature:        0.100000E-07
Initial charges:             Set automatically (system chrg:   0.000E+00)
Included shells:             O:  s, p
                             H:  s
Extra options:
                             Mulliken analysis
Force type                   original

As you can see, all quantities (e.g. force tolerance, electronic temperature) are converted to the internal units of DFTB+, namely atomic units (with Hartree as the base energy unit).

Then the program starts:

***  Geometry step: 0

    iSCC Total electronic   Diff electronic      SCC error
    1   -0.39511797E+01    0.00000000E+00    0.88081627E+00
    2   -0.39705438E+01   -0.19364070E-01    0.55742893E+00
    3   -0.39841371E+01   -0.13593374E-01    0.32497352E-01
    4   -0.39841854E+01   -0.48242063E-04    0.19288772E-02
    5   -0.39841856E+01   -0.17020682E-06    0.87062163E-05

 Total Energy:                      -3.9798793068 H         -108.2980 eV
 Total Mermin free energy:          -3.9798793068 H         -108.2980 eV
 Maximal force component:            0.187090E+00
>> Charges saved for restart in charges.bin
:

Since this is an SCC calculation, DFTB+ has to iterate the charges until the specified convergence criteria is fulfilled. In every cycle, you get information about the values of the electronic energy, its difference to the value in the previous SCC cycle, and the discrepancy (error) between the charges used to build the Hamiltonian and the charges obtained after its solution. This final value is relevant to the tolerance specified in the input (SccTolerance).

If the SCC cycle has converged, the total energy (including SCC and repulsive contributions) is calculated, and similarly the total Mermin free energy (this is the Helmholtz free energy, but where only the electronic entropy is included). Additionally the biggest force component in the system is indicated.

Then the driver changes the geometry of the system, and the self-consistent cycle is repeated as before but for the new geometry. This process continues as long as the geometry does not converge:

***  Geometry step: 12

  iSCC Total electronic   Diff electronic      SCC error
    1   -0.41505816E+01    0.00000000E+00    0.20115717E-02
    2   -0.41505816E+01   -0.21681791E-07    0.14908557E-02
    3   -0.41505816E+01   -0.26422777E-07    0.27122328E-07

 Total Energy:                      -4.0779379339 H         -110.9663 eV
 Total Mermin free energy:          -4.0779379339 H         -110.9663 eV
 Maximal force component:            0.280551E-05
>> Charges saved for restart in charges.bin

 Geometry converged

If the geometry does not converge before the maximum number of geometry steps is reached, the code will stop and you will get an appropriate warning message. Assuming the MaxSteps option had been set to 6 in the input, you would obtain:

***  Geometry step: 6

  iSCC Total electronic   Diff electronic      SCC error
    1   -0.41414806E+01    0.00000000E+00    0.12690850E-01
    2   -0.41414816E+01   -0.96478820E-06    0.93483401E-02
    3   -0.41414827E+01   -0.11442335E-05    0.17373439E-05

 Total Energy:                      -4.0774103506 H         -110.9520 eV
 Total Mermin free energy:          -4.0774103506 H         -110.9520 eV
 Maximal force component:            0.207962E-01
>> Charges saved for restart in charges.bin
WARNING!
-> !!! Geometry did NOT converge!
dftb_pin.hsd

As already mentioned, the processed input file dftb_pin.hsd is an input file generated from your dftb_in.hsd by including the default values for all unspecified options and converting some of the input quantities to atomic units. For example, in our case in the ConjugateGradient block several unspecified options would appear, for which sensible default values have been set:

Driver = ConjugateGradient {
  MovedAtoms = 1:-1
  MaxForceComponent = 1E-4
  MaxSteps = 100
  OutputPrefix = "geom.out"
  LatticeOpt = No
  MaxAtomStep = 0.20000000000000001
  AppendGeometries = No
  ConvergentForcesOnly = Yes
  Constraints {}
}

Similarly, in the DFTB{} block the switch for the orbital resolved SCC, for example, had been set to the default value of No:

OrbitalResolvedScc = No

Options which have been explicitly set in the input are unchanged. The file dftb_pin.hsd is itself a valid HSD input file, and you can use it as input (after renaming it to dftb_in.hsd) to re-run the calculation. It is always in the format suitable for the current parser, even if the input in dftb_in.hsd was for an older format (indicated by the appropriate ParserVersion option). Therefore, the ParserVersion option in the processed input file dftb_pin.hsd is always set to the current version of the parser which generated the file.

detailed.out

This file contains detailed information about the properties of your system. It is updated continuously during the run, by the end of the calculation will contain values calculated during the last SCC cycle. All the numerical values given in this file are in atomic units, unless explicitly specified otherwise.

detailed.out contains (among other data) the number of the last geometry step, a summary of the last SCC cycle and coordinates of any moved atoms:

Geometry optimization step: 12


********************************************************************************
  iSCC Total electronic   Diff electronic      SCC error
    3   -0.41505816E+01   -0.26422777E-07    0.27122328E-07
********************************************************************************

 Coordinates of moved atoms (au):
    1      0.00000000     -1.35303527     -0.00000000
    2     -0.00000000     -0.26834536      1.47115110
    3      0.00000000     -0.26834536     -1.47115110

Then the populaton analysis information follows:

Total charge:    -0.00000000

Atomic gross charges (e)
Atom           Charge
   1      -0.59261515
   2       0.29630757
   3       0.29630757

Nr. of electrons (up):      8.00000000
Atom populations (up)
 Atom       Population
    1       6.59261515
    2       0.70369243
    3       0.70369243

l-shell populations (up)
 Atom Sh.   l       Population
    1   1   0       1.73421713
    1   2   1       4.85839802
    2   1   0       0.70369243
    3   1   0       0.70369243

Orbital populations (up)
 Atom Sh.   l   m       Population
    1   1   0   0       1.73421713
    1   2   1  -1       1.68107958
    1   2   1   0       1.17731844
    1   2   1   1       2.00000000
    2   1   0   0       0.70369243
    3   1   0   0       0.70369243

It shows the total charge of the system and the charges for each atom, followed by detailed population analyis for each atom, shell and orbital.

Then you obtain a count of the total number electrons in the system, and the number of electrons on each atom, each atomic shell of the atoms (s, p, d, etc.) and each atomic orbital (labelled by their mz value) as calculated by Mulliken-analysis:

Nr. of electrons (up):      8.00000000
Atom populations (up)
 Atom       Population
    1       6.59261515
    2       0.70369243
    3       0.70369243

l-shell populations (up)
 Atom Sh.   l       Population
    1   1   0       1.73421713
    1   2   1       4.85839802
    2   1   0       0.70369243
    3   1   0       0.70369243

Orbital populations (up)
 Atom Sh.   l   m       Population
    1   1   0   0       1.73421713
    1   2   1  -1       1.68107958
    1   2   1   0       1.17731844
    1   2   1   1       2.00000000
    2   1   0   0       0.70369243
    3   1   0   0       0.70369243

In our case, due to the electronegativity difference, the hydrogen atoms are positively charged (having only 0.704 electrons), while the oxygen atom is negatively charged (6.59 electrons, instead of the neutral state of 6 valence electrons).

The file then contains the Fermi energy, the different energy contributions to the total energy and the total energy in Hartrees and electron-volts. If you are calculating at a finite electronic temperature, you should consider using the Mermin free energy instead of the total energy:

Fermi level:                        0.0700447751 H            1.9060 eV
Band energy:                       -3.6725069692 H          -99.9340 eV
TS:                                 0.0000000000 H            0.0000 eV
Band free energy (E-TS):           -3.6725069692 H          -99.9340 eV
Extrapolated E(0K):                -3.6725069692 H          -99.9340 eV
Input/Output electrons (q):      8.00000000      8.00000000

Energy H0:                         -4.1689433198 H         -113.4427 eV
Energy SCC:                         0.0183617102 H            0.4996 eV
Total Electronic energy:           -4.1505816095 H         -112.9431 eV
Repulsive energy:                   0.0726436756 H            1.9767 eV
Total energy:                      -4.0779379339 H         -110.9663 eV
Extrapolated to 0:                 -4.0779379339 H         -110.9663 eV
Total Mermin free energy:          -4.0779379339 H         -110.9663 eV

Between the two blocks of energy data, the input and output electron numbers at the last Hamiltonian diagonalisation are shown, so that you can check that no electrons get lost during the calculation.

This is then followed by a confirmation that the SCC convergence has been reached in the last geometry step:

SCC converged

You should always make sure that this is true, so that the properties of your system have been calculated by using convergent charges. Values obtained by using non convergent charges are usually meaningless.

Finally you get the forces on the atoms in your system. You get also the maximal force component occurring in your system and the maximal force occurring among the moved atoms. After this, the dipole moment of the system (in atomic units and Debye) is printed where possible. The end of the file will then show whether the geometry optimisation has reached convergence, i.e., all force components on the moved atoms are below the specified tolerance:

Full geometry written in geom.out.{xyz|gen}

Total Forces
 -1.0881602793401035E-026   6.8304105649286129E-008   4.3629613810658441E-012
 -1.9606916877279574E-016  -3.4153820160920390E-008  -2.8055131119641974E-006
  1.9606916878367734E-016  -3.4150285529999103E-008   2.8055087490097552E-006

Maximal derivative component:       0.280551E-05 au
Max force for moved atoms::         0.280551E-05 au

Dipole moment  :   -0.00000000    0.64280367    0.00000000 au
Dipole moment  :   -0.00000000    1.63384410    0.00000000 Debye

Geometry converged

As indicated above, in the current case, the final relaxed geometries can be found stored as xyz and gen format in the output files geom.out.xyz and geom.out.gen.

band.out

This file contains the energies of the individual electronic levels (orbitals) in electronvolts, followed by the occupation of the individual single particle levels for all of the possible spin channels. For spin unpolarised calculations (like this one) you will get only one set of values, since the levels are spin restricted and are twofold degenerate. In a collinear spin polarised calculation you would obtain separate values for the spin up and spin down levels:

KPT            1  SPIN            1  KWEIGHT    1.0000000000000000
   -23.10209     2.00000
   -11.27470     2.00000
    -8.53769     2.00000
    -7.05252     2.00000
    10.86455     0.00000
    15.19442     0.00000

The eigenenergies are in units of electron volts. You can use the scripts dp_bands in the dptools package to convert the data in band.out to XNY-format, which can be visualised with common 2D plotting tools.

Despite its name, the file band.out is also created for non-periodic systems, containing the eigenenergies and occupation numbers for molecular systems (You should ignore the k-point index and the k-point weight in the first line in this case).

results.tag

If you want to process the results of DFTB+ with another program, you should not extract the information from the standard output or the human readable output files (detailed.out, band.out, etc.), since their format could significantly change between subsequent releases of DFTB+. By setting the WriteResultsTag to Yes in the Options {} block:

Options {
  WriteResultsTag = Yes
}

you obtain the file results.tag at the end of your calculation, which contains some of the most important data in a format easily parsed by a script or a program. This file contains entries like:

forces              :real:2:3,3
 -0.108816027934010E-025  0.683041056492861E-007  0.436296138106584E-011
 -0.196069168772796E-015 -0.341538201609204E-007 -0.280551311196420E-005
  0.196069168783677E-015 -0.341502855299991E-007  0.280550874900976E-005

In the first line the name of the quantity is given, followed by its type (real, integer, logical). Then the rank of the quantity is given (0: scalar, 1: vector, 2: rank 2 matrix, etc.), followed by the size of each dimension. Following this, the data for the given quantity is dumped as free format.

Other output files

There are also other output files not discussed in detail here. They are only created, if appropriate choices in the Options or ExcitedState blocks are set. Please consult the manual for further details.

Band structure, DOS and PDOS

This chapter demonstrates, using the example of anatase (TiO2), how the band structure, density of states (DOS) and the partial density of states (PDOS) of a periodic system (such as wires, surfaces or solids) can be obtained using DFTB+.

The conversion scripts used here are part of the dptools package, which is distributed with DFTB+. In order to perform the calculations in this chapter, you will need the Slater-Koster sets mio and tiorg. The sample input files assume that the necessary Slater-Koster files have been copied into a subdirectory mio-ext.

Introduction

The calculation of the band structure for a periodic system consists of two steps.

  • First for self-consistent (SCC) calculations, the charges in the system must be calculated using a converged k-point sampling.
  • Then, keeping the obtained self-consistent charges fixed, the one-electron levels must be calculated for k-points chosen along the specific lines in k-space of the chosen band structure. These are usually between high symmetry points in the Brillouin zone of that unit cell.

Creating the proper input charges

[Input: recipes/basics/bandstruct/1_density/]

In order to calculate a band structure in Density Functional Theory (DFT), at first the ground-state density for the given system must be obtained. In the DFTB picture, this corresponds to obtaining the self-consistent charges of the atoms. These charges must be convergent with respect to two quantities in order to give correct results:

  • Tolerance of the SCC cycle and
  • quality of the k-point sampling grid.

In the current tutorial, the SCC tolerance is set to be 1e-5. For the k-point sampling, the \(8 \times 8 \times 8\) Monkhorst-Pack set will be used. Both quantities ensure good convergence in the charges for anatase, but may not be suitable for other applications.

We will also use the results of the converged calculation to obtain information about the density of states (DOS) and the partial density of states (PDOS) of anatase. This information only makes sense when extracted from a system with a good k-point sampling. A sample dftb_in.hsd input looks like:

Geometry = GenFormat {
    6  F
 Ti  O
    1 1    0.4393045491E-02   -0.4394122690E-02   -0.4185505032E-06
    2 1   -0.2456050838E+00   -0.7543932244E+00    0.5000007729E+00
    3 2    0.1997217007E+00    0.2106836749E+00   -0.1813953963E-02
    4 2   -0.4625010039E+00    0.4843137675E-01    0.4981557672E+00
    5 2   -0.2106822274E+00   -0.1997223911E+00    0.1816384188E-02
    6 2   -0.4843281768E-01   -0.5374990457E+00    0.5018414482E+00
    0.0000000000E+00    0.0000000000E+00    0.0000000000E+00
   -0.1903471721E+01    0.1903471721E+01    0.4864738245E+01
    0.1903471721E+01   -0.1903471721E+01    0.4864738245E+01
    0.1903471721E+01    0.1903471721E+01   -0.4864738245E+01
}

Hamiltonian = DFTB {
  Scc = Yes
  SccTolerance = 1e-5
  SlaterKosterFiles = Type2FileNames {
    Prefix = "../../../slakos/mio-ext/"
    Separator = "-"
    Suffix = ".skf"
  }
  MaxAngularMomentum {
    Ti = "d"
    O  = "p"
  }
  KPointsAndWeights = SupercellFolding {
    4 0 0
    0 4 0
    0 0 4
    0.5 0.5 0.5
  }
}

Analysis {
  ProjectStates {
    Region {
      Atoms = Ti
      ShellResolved = Yes
      Label = "dos_ti"
    }
    Region {
      Atoms = O
      ShellResolved = Yes
      Label = "dos_o"
     }
  }
}

ParserOptions {
  ParserVersion = 7
}

In the input above, the coordinates have been specified in relative (fractional) coordinates, which express the positions of the atoms as a linear combination of the lattice vectors. This is indicated by using the letter F in the first line of the geometry specification:

Geometry = GenFormat {
    6  F
 :

The k-points are generated automatically using the SupercellFolding method, which enables among others the generation of Monkhorst-Pack schemes. In the current example, a k-point set equivalent to the Monkhorst-Pack scheme \(4 \times 4 \times 4\) has been chosen (For details how to specify the coefficients and the shift vectors, please consult the manual).:

KPointsAndWeights = SupercellFolding {
  4 0 0
  0 4 0
  0 0 4
  0.5 0.5 0.5
}

You can check, by generating denser k-point sets, that the current choice gives an accuracy in the range of 1e-3 eV for the total energy. Also, by specifying a smaller SCC tolerance than the chosen one (1e-5) you can check that converging the charges more precisely does not significantly decrease the total energy. We note in passing that these settings provide well converged results for the total energy in the current example, but in principal may not provide converged values for other properties. One should, in principal, test the convergence of any evaluated properties with respect to the calculation parameters.

We will plot the DOS of this system by using the output in the file band.out. In order to also obtain a PDOS as well, the appropriate atoms (on to which the electronic states should be projected) are also specified. The resulting data will then be stored in separate files. In practice, this is done in the Analysis block using the ProjectStates options. In our example:

Analysis {
  ProjectStates {
    Region {
      Atoms = Ti
      ShellResolved = Yes
      Label = "dos_ti"
    }
    Region {
      Atoms = O
      ShellResolved = Yes
      Label = "dos_o"
     }
  }
}

we decide to get the PDOS for the Ti and the O atoms separately. Each Region block specifies the atoms (either selected by species, atomic ranges, or as a combination of both), for which PDOS should be created. Additionally, you can select, whether you would like to see each atomic shell of the atoms in a region (s, p, d, etc.) separately or together for that region. With the Label tag you can specify the prefix for the data files created. Using the settings above, we will obtain 5 files: dos_ti.1.dat, dos_ti.2.dat, dos_ti.3.dat, dos_o.1.dat and dos_o.2.dat. The first three contain the PDOS for the s, p, and d shells of Ti, while the last two files provide the oxygen s and p shells.

Plotting the density of states

You can use the dp_dos program from the dptools package to take the eigenlevels stored in band.out, apply a gaussian smearing to them, and to store the result in a format, which can be easily plotted by any 2D visualization tool. You have to issue:

dp_dos band.out dos_total.dat

This would create a file dos_total.dat in NXY format, with the energies as X-values and the calculated DOS values as Y-values. You can tune the output by setting different options for dp_dos. Invoke it with the help option:

dp_dos -h

shows detailed information about possible options. The results can be visualised with xmgrace, for example, with the commands:

xmgrace -nxy dos_total.dat

and by zooming into the region around the Fermi-level (showing the valence band edge and the conduction band edge), you should obtain a picture like this:

DOS of TiO2 anatase as calculated by DFTB+.

In order to investigate the nature of the states forming the valence and conduction band edges, we will then plot the contribution of the individual atomic shells to the band edges. For that, we have to convert the PDOS-files into NXY files. In the case of dos_ti.1.dat you would execute:

dp_dos -w dos_ti.1.out dos_ti.s.dat

and similarly for the other PDOS files. It is important that you specify the weighting option -w for the PDOS files, as otherwise the total DOS (instead of the appropriate PDOS) will be created in each case. By visualizing the obtained data files together with the total DOS, you should obtain a picture like:

DOS and PDOS of TiO2 anatase as calculated by DFTB+.

Here you can see that the valence band edge of anatase is entirely composed of the oxygen p-orbitals, while the conduction band edge is made of the d-orbitals of titanium.

Calculating the band structure

[Input: recipes/basics/bandstruct/2_bands/]

Once well converged charges for a system have been obtained, the band structure can then be calculated at any chosen k-point. In our case, we will choose the points lying along a line which goes through the high symmetry points, Z-Gamma-X-P, of the anatase Brillouin zone. In order to do that, the input has to be changed slightly:

# ...

Hamiltonian = DFTB {
  Scc = Yes
  ReadInitialCharges = Yes
  MaxSCCIterations = 1

  # ...

  KPointsAndWeights = Klines {
    1   0.5   0.5  -0.5    # Z
   20   0.0   0.0   0.0    # G
   45   0.0   0.0   0.5    # X
   10   0.25  0.25  0.25   # P
  }
}

# ...

Note: only the relevant parts of the input are shown, here. See the Introduction section on how to obtain the archive with the full input.

The input is (must be) almost the same as in the previous case, with only a few adaptions:

  • As we want to use the charges, as obtained in the previous well converged calculation, you have to copy the charges.bin file from the previous calculation into the directory of the current calculation. At the same time, you must instruct the code to read those charges, by setting:

    ReadInitialCharges = Yes
    
  • Since we want to use the well converged charges to obtain the band structures and do not want to change them during the calculation, the maximal number of SCC cycles should be set to 1:

    MaxSCCIterations = 1
    
  • Finally, the k-points should be adapted according to the lines in the Brillouin-zone, along which you wish to obtain the band structure. You can achieve that by using the Klines directive:

    KPointsAndWeights = Klines {
      1   0.5   0.5  -0.5    # Z
     20   0.0   0.0   0.0    # G
     45   0.0   0.0   0.5    # X
     10   0.25  0.25  0.25   # P
    }
    

    Every line of this block specifies a line segment. The first column gives the number of k-points along the line segment between (but excluding) the end of the previous line segment and the k-point which is specified as the next three columns (which is the end point of the current line segment). The specified number of k-points are evenly distributed along the line segment, with the last k-point coincident with the end point of the segment. The coordinates of the k-points are fractional coordinates (given in the coordinate system of the reciprocal lattice vectors of the periodic structures).

    The starting point of the first line segment is by default the Gamma point, but you can override this behaviour by setting a first line segment with one point only, as demonstrated above for the Z-point.

    Running DFTB+ with the input above, the eigenlevel spectrum is calculated at the required k-points. The results are written to the file band.out. You can use the script dp_bands from the dptools package to convert this file into XNY format. By issuing:

    dp_bands band.out band
    

    you would then obtain a file band_tot.dat containing the band structures. After plotting it, you should see something like:

    Band structure of TiO2 anatase as calculated by DFTB+.

    Note, DFTB+ enumerates the k-points along the lines you specified starting at one. The vertical bars corresponding to the special points \(Z\), \(\Gamma\), \(X\) and \(P\) must be therefore inserted on positions 1, 21, 66, 76.

First steps with Waveplot

[Input: recipes/basics/waveplot/]

Waveplot is a tool for generating grid-based volumetric data for charge distributions or wave-functions in your system. By visualising those distributions with appropriate graphical tools you can obtain a deeper understanding of the physics and chemistry of your quantum mechanical system.

Making a DFTB+ calculation

In order to plot the charge distribution or the orbitals in a certain system, you have to execute a DFTB+ calculation for this system first. The calculation must be executed as usual, you just have to make sure, that the options WriteDetailedXML and WriteEigenvectors are turned on.

Below you see the input for the H2O molecule, where the geometry is optimised by DFTB+:

Geometry = GenFormat {
3  C
 O H
     1    1    0.00000000000E+00  -0.10000000000E+01   0.00000000000E+00
     2    2    0.00000000000E+00   0.00000000000E+00   0.78306400000E+00
     3    2    0.00000000000E+00   0.00000000000E+00  -0.78306400000E+00
}

Driver = ConjugateGradient {
  MovedAtoms = 1:-1
  MaxForceComponent = 1.0e-4
  MaxSteps = 100
  OutputPrefix = "geom.out"
}

Hamiltonian = DFTB {
  Scc = Yes
  SccTolerance = 1.0e-5
  SlaterKosterFiles = Type2FileNames {
    Prefix = "../../slakos/mio-ext/"
    Separator = "-"
    Suffix = ".skf"
  }
  MaxAngularMomentum = {
    O = "p"
    H = "s"
  }
}

Options {
  WriteDetailedXml = Yes
  WriteEigenvectors = Yes
}

ParserOptions {
  ParserVersion = 7
}

Running DFTB+ for this input, you should obtain the usual results, and additionaly the files detailed.xml and eigenvec.bin. Former contains some information about the calculated system, latter contains the obtained eigenvectors in binary format. Both files are needed by waveplot.

Running Waveplot

Now, you have to decide, what kind of charge distributions, wavefunctions etc. to plot. In the current example, we will plot the total charge distribution of the water molecule, the charge distribution (wavefunction squared) for the highest occupied molecular orbital (HOMO), the wave function for the HOMO, and the total charge difference, which tells us, how the chemical bonding between the atoms modified the total charge distribution compared to the superpositions of neutral atomic densities.

Input

The appropriate waveplot input (waveplot_in.hsd) could look like the following:

# General options

Options {
  TotalChargeDensity = Yes           # Total density be plotted?
  TotalChargeDifference = Yes        # Total density difference plotted?
  ChargeDensity = Yes                # Charge density for each state?
  RealComponent = Yes                # Plot real component of the wavefunction
  PlottedSpins = 1 -1
  PlottedLevels = 4                  # Levels to plot
  PlottedRegion =  OptimalCuboid {}  # Region to plot

  NrOfPoints = 50 50 50              # Number of grid points in each direction
  NrOfCachedGrids = -1               # Nr of cached grids (speeds up things)
  Verbose = Yes                      # Wanna see a lot of messages?
}

DetailedXml = "detailed.xml"         # File containing the detailed xml output
                                     # of DFTB+
EigenvecBin = "eigenvec.bin"         # File cointaining the binary eigenvecs


# Definition of the basis
Basis {
  Resolution = 0.01
  # Including mio-1-1.hsd. (If you use a set, which depends on other sets,
  # the wfc.*.hsd files for each required set must be included in a similar
  # way.)
  <<+ "../../slakos/wfc/wfc.mio-1-1.hsd"
}

Some notes to the input:

  • Option TotalChargeDensity controls the plotting of the total charge density. If turned on, the file wp-abs2.cube is created.

  • Option TotalChargeDifference instructs Waveplot to plot the difference between the actual total charge density and the density you would obtain by summing up the densities of the neutral atoms.

  • Option ChargeDensity tells the code, that the charge distribution for some orbitals (specified later) should be plotted. Similarly, RealComponent instructs Waveplot to create cube files for the real part of the one-electron wavefunctions for the specified orbitals. (For non-periodic systems the wavefunctions are real.)

  • Options PlottedSpins, PlottedLevels (for periodic systems also PlottedKPoints) controls the levels (orbitals) to plot. In the current example we are plotting level 4 (is the HOMO of the water molecule) for all available spins. Since the DFTB+ calculation was spin unpolarised, we obtain only one plot for the HOMO in file wp-1-1-4-abs2.cube (1-1-4 in the file name indicates first K-point, first spin, 4th level).

  • The region to plot is selected with the option PlottedRegion. Instead of specifying the box origin and box dimensions by hand, Waveplot can be instructed by using the OptimalCuboid method to take the smallest cuboid, which contains all the atoms and enough space around them, so that the wavefunctions are not leaking out of it. (For details and other options for PlottedRegion please consult the manual.) The selected region in the example is sampled by a mesh of 50 by 50 by 50. (NrOfPoints)

  • The basis defintion (Basis) is made by including the file containing the appropriate wave function coefficient definitions. You must make sure that you use the file for the same set, which you used during your DFTB+ calculation. Here, the mio-1-1 set was used for calculating the H2O molecule, and therefore the file wfc.mio-1-1.hsd is included.

    The wavefuntion coefficients can be usually downloaded from the same place as the Slater-Koster files.

Output
================================================================================
     WAVEPLOT  0.2
================================================================================

Interpreting input file 'waveplot_in.hsd'
--------------------------------------------------------------------------------
WARNING!
-> The following 3 node(s) had been ignored by the parser:
(1)
Path: waveplot/Basis/C
Line: 1-33 (File: wfc.mio-0-1.hsd)
(2)
Path: waveplot/Basis/N
Line: 52-84 (File: wfc.mio-0-1.hsd)
(3)
Path: waveplot/Basis/S
Line: 120-170 (File: wfc.mio-0-1.hsd)

Processed input written as HSD to 'waveplot_pin.hsd'
Processed input written as XML to 'waveplot_pin.xml'
--------------------------------------------------------------------------------

Doing initialisation

Starting main program

Origin
  -5.00000 -6.35306 -6.47114
Box
  10.00000 0.00000 0.00000
  0.00000 11.08472 0.00000
  0.00000 0.00000 12.94228
Spatial resolution [1/Bohr]:
  5.00000 4.51071 3.86331

Total charge of atomic densities:    7.981973


 Spin KPoint  State  Action        Norm   W. Occup.
    1      1      1    read
    1      1      2    read
    1      1      3    read
    1      1      4    read

Calculating grid

    1      1      1    calc    0.996855    2.000000
    1      1      2    calc    1.003895    2.000000
    1      1      3    calc    0.998346    2.000000
    1      1      4    calc    1.000053    2.000000
File 'wp-1-1-4-abs2.cube' written
File 'wp-1-1-4-real.cube' written
File 'wp-abs2.cube' written

Total charge:    7.998297

File 'wp-abs2diff.cube' written

================================================================================

Some notes on the output:

  • The warnings about unprocessed nodes appears, because the included file wfc.mio-0-1.hsd also contained wave function coefficients for elements (C, N, S), which are not present in the calculated system. Hence these extra definitions in the file were ignored.

  • The Total charge of atomic densities tells you the amount of charge found in the selected region, if atomic densities are superposed. This number should be approximately equal to the number of electrons in your system (here 8). There could be two reasons for a substantial deviation. Either the grid is not dense enough (option NrOfPoints) or the box for the plotted region is too small or misplaced (PlottedRegion).

  • The output files for the individual levels (charge density, real part, imaginary part) follow the naming convention wp-KPOINT-SPIN-LEVEL-TYPE.cube.

    The total charge and the total charge difference are stored in the files wp-abs2.cube and wp-abs2diff.cube, respectively.

Visualising the results

The volumetric data generated by Waveplot is in the Gaussian cube format and can be visualized with several graphical tools (VMD, JMol, ParaView, …). Below we show the necessary steps to visualize it using VMD. (It refers to VMD version 1.8.6 and may differ in newer versions.)

Total charge distribution

The cube file containing the total charge distribution wp-abs2.cube can be read by using the File|New Molecule menu. VMD should automatically recognise, that the file has the Gaussian cube format. After successful loading, the VMD screen shows the skeleton of the molecule.

In order to visualise the charge distribution, the graphical representation of the molecule has to be changed. This can be achieved by using the Graphics|Representations... submenu. The skeleton representation can be turned to a CPK represenation (using balls and sticks) by selecting CPK for the Drawing method in the Graphical Representations dialog box. Then you should create an additional representation (Create Rep) and change the drawing method for it to be Isosurface. The type of isosurface (Draw) should be changed from Points to Solid Surface and instead of Box+Isosurface only Isosurface should be selected. Then, by tuning the Isovalue one can select the isosurface to be plotted. Figure 1 was created using 0.100. (Display background color had been set to white using the Graphics|Colors menu.)

H2O density

Total charge density for the H2O molecule, created by Waveplot, visualised by VMD.

Charge distribution difference

The charge distribution difference can be plotted in a similar way as the total charge. One has to load the file wp-abs2diff.cube. One should then, however, make not one, but two additional graphical representations of the type Isosurface. One of them should have positive isovalue, the other one a negative one. The different isosurfaces can be colored in a different way by using ColorID as coloring method and choosing different color values for the different representations.

Figure 2 demonstrates this for the water molecule. Negative net populations were colored red, positive net populations blue. One can clearly see, that there is a significant electron transfer from the hydrogens to the oxygen (lone pair on the oxygen).

H2O density difference

Charge density difference (total density minus sum of atomic densities) for the H2O molecule, as created by Waveplot and visualised by VMD.

Molecular orbitals

The plotting of molecular orbitals can be, depending which property is plotted, done in the same way as the total charge distribution or the total charge difference. If the charge density (probability distribution) of an orbital is plotted, the data contains only positive values, therefore only one isosurface representation is necessary (like for the charge distribution). If the real (or for periodic systems also the imaginary) part of the wavefunction is to be plotted, two isosurface representations are needed, one for the positive and one for the negative values (like for the charge difference).

Figure 3 shows the distribution of the electron (wavefunction squared) for the HOMO, while Figure 4 shows the HOMO wavefunction itself (blue - positive, red - negative). You can easily recognise the p-type of the HOMO, positive on one side, negative on the other side, a node plane in the middle.

H2O homo density

Highest occupied molecular orbital of a water molecule (wavefunction square)

H2O homo real

Highest occupied molecular orbital of a water molecule (real part of the wavefunction).

Molecular dynamics

This chapter discusses DFTB+ calculations where the (classical) dynamics of the atoms are simulated in different thermodynamical ensembles and with different methods of integrating the equations of motion.

Molecular dynamics can be used to provide information which is not easily accessible from static simulations. This includes properties such as anharmonic vibrational modes, the stability of different structures at finite temperatures, or to explore the range of structural minima the system can adopt.

Molecular dynamics

Classical molecular dynamics treats the position and motion of the atoms (the ion cores) as with Newtonian dynamics. Often this is with modifications to include the effect of contact with a thermal reservoir (NVT) or additionally treats the system as though it is in a piston that controls the pressure (NPT).

DFTB+ has several options for basic molecular dynamics, but also can use an accelerated form of SCC-DFTB which has similar performance properties to non-SCC for suitable systems.

The basic method adopted for molecular dynamics in DFTB+ is the Velocity Verlet integrator (Probably more correctly called the Newton-Stormer-Verlet method, as its been independently discovered several times). This is an implicit symplectic integrator, which is good at preserving the total energies in the NVE ensemble for separable Hamilton systems with energy of the form \(H(p, q) = T(p) + U(q)\) where \(p\) and \(q\) are conjugate momenta and positions of the atoms.

The form of the integrator is an update for the positions at step \(n+1\):

\(x_{n+1} = x_n + \tau v_n + \tau^2 F_n / 2 m\),

followed by evaluation of the forces, \(F_{n+1}\) at that geometry and then an update for the atomic velocities:

\(v_{n+1} = v_n + \tau \left[ F_{n+1} + F_{n} \right] / 2 m\).

The user defined parameter \(\tau\) controls the accuracy of the integration, typically this should be chosen to be somewhat shorter than the fastest vibrational period of the system. For many systems, \(\tau \approx 1\) fs is reasonably stable, but this should of course be tested for the particular case you are interested in.

Preparing for an MD calculation

The initial structure for starting a molecular dynamics simulation should usually be fully structurally relaxed. This is done to remove any excess potential energy, which would otherwise rapidly convert into kinetic energy (so heating up the system).

This example relaxes the geometry of a molecule (PTCDA) to a local miniumum, prior to performing molecular dynamics.

After running the relaxation, the output file geo_end.gen is now suitable to use as a starting structure for MD (or for other property calculations).

Vibrational modes

Once at a structural minimum, the quasi-harmonic vibrational modes of the atoms in the system can be calculated. These can then be compated with the power spectrum of the system, as determined by molecular dynamics.

The vibrational modes can be obtained from the mass-weighted Hessian matrix of the system. This can be calculated by DFTB+ from finite difference second derivatives of the energy with respect to atom positions (technically, the first derivatives of the forces are actually used).

This derivative calculation can be enabled as an alternative choice of the Driver:

Driver = SecondDerivatives {
    Delta = 1E-4
}

The provided example also sets the size of the finite difference step used to differentiate the energy as 0.0001 atomic units.

The modes are known as quasi-harmonic, since depending on the numerical step size these derivatives will include a contribution from higher even derivatives of the function. Hence, for accurate evaluation of harmonic frequencies, the smallest stable choice of the step size should be used (however, a carefull choice of this value can sometimes be used to include anharonicity in the calculated vibrational energies, see for example Jones and Briddon 1998).

Running the code will produce a file called hessian.out which contains the derivative information.

Calculating the modes

The modes code can read the Hessian matrix and calculate the vibrational modes. An example input file is included in this directory using the hessian.out file that is produced by the DFTB+ calculation using the supplied dftb_in.hsd.

The general structure of the input for modes is similar to DFTB+, also being in the hsd format. The code expects a modes_in.hsd file, which contains blocks for the geometry, Hessian and the Slater-Koster files (used to get the mass of the atoms). A typical input looks like:

# Needs the equilibrium geometry, at which the Hessian had been calculated
Geometry = GenFormat {
   <<< geom.gen
}

DisplayModes = {
  PlotModes = -20:-1 # Take the top 10 modes
  Animate = Yes      # make xyz files showing the atoms moving
}

# You need to specify the SK-files, as the mass of the elements is needed
SlaterKosterFiles = Type2FileNames {
  Prefix = "../../slakos/mio-ext/"
  Separator = "-"
  Suffix = ".skf"
}

# Include the Hessian, which was calculated by DFTB+
Hessian = {
  <<< "hessian.out"
}

# This file uses the 3rd input format of the modes code
InputVersion = 3

Here, the code produces animations of several modes, which can be viewed for example using the jmol cross platform viewer

jmol mode_114.xyz &

for the highest frequency stretch mode of this molecule.

Dynamics in the ground state

Dynamics on the Born-Oppenheimer ground state energy surface can be performed in DFTB+ by setting the input geometry driver to be VelocityVerlet

Driver = VelocityVerlet{
  TimeStep [fs] = 1.0
  Thermostat = NoseHoover {
    Temperature [Kelvin] = 400
    CouplingStrength [cm^-1] = 3200
  }
  Steps = 20000
  MovedAtoms = 1:-1
  MDRestartFrequency = 100
}

The velocity Verlet driver should have a time step on the scale of ~10x the highest vibrational period in the system. 1 fs is a common choice. The

The input file specifies initial velocities of the atoms

Velocities [AA/ps] {
  0.63060001     10.71652407      0.41599521
  -4.78167517     -0.67726160      6.81193886
  .
  .
}

Thermalising a system

The initial velocities of atoms can be user supplied, however it is more common to generate them by thermalising the system starting from an initial Maxwell-Boltzmann distribution of atomic velocities. These can be generated for example by using the following input:

Thermostat = NoseHoover {
  # Target temperature
  Temperature [Kelvin] = 400
  # Approximately the highest vibrational frequency of the molecule
  CouplingStrength [cm^-1] = 3200
}

within the VelocityVerlet input block. The initial values of the velocities are set from a random number generator, hence to make the calculation repeatable this is set to a specific value

Options = {
  RandomSeed = 3871906
}

However, for real calculations, it would be common to use a fully random choice, by omitting the RandomSeed value.

Speeding up SCC MD

Instead of standard Born-Oppenheimer (BO) molecular dynamics, there are several alternative methods that can propagate both the atomic positions and electronic structure simultaneously, often with substantial improvements in computational speed.

These are either the Car-Parrinello scheme or extended Lagrangian (XL) Born-Oppenheimer dynamics. DFTB+ supports the second of these methods, which (when stable) produces SCC equivalent dynamics to conventional Born-Oppenheimer molecular dynamics but with similar (lower) costs to non-SCC DFTB.

This example uses the fast XL-BOMD method, with a single SCC step for each geometry once the calculation starts. Here the dynamics of an extended Lagrangian is used to predict the charges for each time step, but without using a self-consistent loop. Fast XL-BOMD is potentially less stable, so should only be used for systems which show good SCC behaviour (molecules without degeneracies or solids with wide band gaps).

To enable XL dynamics, the VelocityVerlet{} block should be modified to contain an extra section:

XlbomdFast {
    IntegrationSteps = 5
    Scale = 0.5
    TransientSteps = 10
}

This instructs the code to use 5 time steps to integrate the equations of motion (EOM) for the system with an initial 10 steps of Born-Oppenheimer dynamics before starting the XL dynamics. The initial BO steps lead to a more stable start for the XL dynamics. The scaling factor is used to also increase stability of the XL-EOMs, and should be in the range of (0..1], where the largest value which is stable should be used (this can only be determined by testing).

Due to the reduced requirements on SCC during XL-BOMD dynamics, the standard DFTB forces are usually not sufficiently accurate due to increased self-consistency errors. Hence an extra command should be added to the Hamiltonian {} block:

ForceEvaluation = Dynamics

This enables an additional correction in the forces, which can be used for Fermi fillings. This correction can also be used in other calculations where the forces are evaluated with limited self-consistency (for example to speed up geometry relaxation be relaxing the SCCTolerance).

In the special case of the electronic temperature of the system being set to be 0, there is a faster version of this correction which can be used:

ForceEvaluation = DynamicsT0

Analysing MD

MD simulations contain various information about the system. Fluctuations and correlations in the ground state encode information about excitations (fluctuation-dissipation theorem) and transport (Einstein relation and also the Green-Kubo relations).

The velocity-velocity auto-correlation encodes the (anharmonic) vibrational modes, while the dipole auto-correlation contains information on the mode intensities. Both of these can be evaluated from the output of the code - more information to follow on this topic.

Simulated annealing

By heating up a system, and then allowing it to slowly cool, stable minima can be found. If the cooling is sufficiently (logarithmically) slow, then the global minima will be found, however this is not usually practical. More realistically, several heating and cooling experiments can find some stable structures, or investigate the stability of known geometries.

The first example anneals away a Stone-Wales defect in a graphene sheet. This is acheived by heating the system up to a high (5000K) temperature where C-C bonds start to break, holding it at these temperatures for a while and then cooling to a low temperature. The high temperature is above the usual disociation temperature of graphene, but since only a relatively short (2.4 ps) time is computed, using a higher temperature accelerates the annealing.

Annealing to multiple minima

The annealing process can also be used to find alternative local minina. Here two vacancies in a graphene sheet are heated. Depending on the initial conditions, the system anneals to different final structures. The starting velocities are chosen at random, so depending on the seed value for the random generator (several different cases are given in the input file) a different final defect geometry is obtained by the same cycle of heating and cooling.

PLUMED2 integration and metadynamics

Metadynamics is a class of methods that are used to estimate free energy (and other state functions) of a system in cases where normal ergodic sampling is difficult due to the system’s energy landscape.

PLUMED2 is a molecular dynamics biasing and analysis package that has been interfaced with DFTB+ to give metadynamics and free energy sampling functionality. It can select a wide range of collective variables to sample systems, such as the distance between pairs of atoms or specified torsion angles.

To use PLUMED2, DFTB+ must be compiled with support for this enabled (and a compiled version of the plumed library available, see the plumed website). Set the Plumed tag in the Driver input block to Yes with VerlocityVerlet selected as the geometry driver. A plumed.dat input file must also be present in the run directory

Driver = VelocityVerlet{
 TimeStep [fs] = 1.0

 Plumed = Yes

 Thermostat = NoseHoover {
   Temperature [Kelvin] = 400
    CouplingStrength [cm^-1] = 3050
 }
.
.
.
}

The plumed.dat file contains information to be read by the PLUMED2 code that will allow it to either bias or analyse the molecular dynamics on the fly.:

DISTANCE ATOMS=4,9 LABEL=d1
DISTANCE ATOMS=5,9 LABEL=d2

METAD ...
LABEL=met ARG=d1,d2 PACE=100 HEIGHT=3 SIGMA=0.01,0.01
FILE=HILLS
BIASFACTOR=4 TEMP=400
... METAD

PRINT ARG=d1,d2 STRIDE=100 FILE=plumed_o.dat

ENDPLUMED

The first 2 lines of the example plumed.dat file tell PLUMED2 to define the distance between atoms 4 and 9 and the distance between atoms 5 and 9 and label them d1 and d2 respectively. the METAD tag tells plumed to read in parameters for a metadynamics bias calculation with d1 and d2 as input collective variables. This metadynamics simulation places a Gaussian hills function every 100 md steps. HEIGHT, SIGMA and BIASFACTOR are all metadynamics parameters while TEMP should correlate with the thermostat temperature. The Gaussian bias functions are written to the file specified by the FILE tag. The PRINT function tells PLUMED2 to print d1 and d2 every 100 steps to the file specified by the FILE tag. Finally, the ENDPLUMED tag indicates to PLUMED2 that input is complete.

More complete information on the functionality of PLUMED2 can be found in its user manual.

Graphene defect calculations

This chapter contains recipes for various electronic calculations of graphene nanoribbons. It demonstrates some defect calculation with plot of PDOS and wavefunctions. These examples can be helpful in order to understand the first application of transport calculations in chapter Electron transport.

Electronic structure of 2D carbon materials

Perfect graphene

First we will investigate some of the basic properties of the 2D graphene structure.

Geometry, density of states

[Input: recipes/defect/carbon2d-elect/graphene/latopt/]

Preparing the input

Graphene has a hexagonal lattice with two C atoms in its primitive unit cell, which is specified in the supplied GEN-formatted geometry file. Open the file geo.gen in a text editor. You should see the following content:

2  S
C
 1 1    0.1427557522E+01    0.0000000000E-00    0.0000000000E-00
 2 1   -0.1427557522E+01    0.0000000000E-00    0.0000000000E-00
 0.0000000000E+00    0.0000000000E+00    0.0000000000E+00
 0.2141036415E+01   -0.1236340643E+01    0.0000000000E-00
 0.2141036415E+01    0.1236340643E+01    0.0000000000E-00
 0.0000000000E-00    0.0000000000E-00    0.5000000000E+02
  • Since the structure is periodic, appropriate information for this boundary condition must be provided after the atomic coordinates. For a GEN file of type S, this is the cartesian coordinates of the origin followed by the 3 cartesian lattice vectors (one per line). DFTB+ uses three dimensional periodic boundary conditions. In order to separate the graphene sheets from each other and to prevent interaction between them, the third lattice vector, which is orthogonal to the plane of graphene, has been chosen to have a length of 50 angstroms.

Before running the code, you should check, whether the specified unit cell, when repeated along the lattice vectors, indeed results in a proper graphene structure. To repeat the geometry along the first and second lattice vectors a few times (the repeatgen script), convert it to XYZ-format (the gen2xyz script) and visualize it:

repeatgen geo.gen 4 4 1 > geo.441.gen
gen2xyz geo.441.gen
jmol geo.441.xyz &

You should then see a graphene sheet displayed, similar to Figure 5.

Graphene 4x4x1 sheet

4x4x1 graphene supercell

Now open the DFTB+ control file dftb_in.hsd. You should see the following options within it:

  • First we include the GEN-formatted geometry file, geo.gen, using the inclusion operator (<<<):

    Geometry = GenFormat {
       <<< "geo.gen"
    }
    
  • Then we specify the ConjugateGradient driver to optimize the geometry and also the lattice vectors. Since neither the angle between the lattice vectors nor their relative lengths should change during optimization, we carry out an isotropic lattice optimization:

    Driver = ConjugateGradient {
      LatticeOpt = Yes
      Isotropic = Yes
    }
    
  • Then the details of the DFTB hamiltonian follow:

    Hamiltonian = DFTB {
    
  • Within this block, we first specify the location of the parametrization files (the Slater-Koster files) and provide additional information about the highest angular momentum for each element (this information is not yet stored in the Slater-Koster-files):

    MaxAngularMomentum {
      C = "p"
    }
    SlaterKosterFiles = Type2FileNames {
      Prefix = "../../slako/"
      Separator = "-"
      Suffix = ".skf"
    }
    

    Please note, that the highest angular momentum is not a free parameter to be changed, but it must correspond to the value given in the documentation section of the correspoding homonuclear Slater-Koster-files (e.g. see the C-C.skf file for carbon).

  • We use the self-consistent charge approach (SCC-DFTB), enabling charge transfer between the atoms:

    SCC = Yes
    
  • As graphene is metallic we smear the filling function to achieve better SCC-convergence:

    Filling = Fermi {
      Temperature [Kelvin] = 100
    }
    
  • For the Brillouin-zone sampling we set our k-points according to the 48 x 48 x 1 Monkhorst-Pack sampling scheme. This contains those k-points which would be folded onto the k-point (0.5, 0.5, 0.0) of an enlarged supercell consisting of the primitive unit cell repeated by (48, 0, 0), (0, 48, 0) and (0, 0, 1). This can be easily specified with the SupercellFolding option, where one defines those supercell vectors followed by the target k-point.

    KPointsAndWeights = SuperCellFolding {
      48 0 0
      0 48 0
      0 0 1
      0.5 0.5 0.0
    }
    
  • We also want to do some additional analysis by evaluating the contributions of the s- and p-shells to the density of states (DOS). Accordingly, we instruct DFTB+ in the Analysis block to calculate the contribution of all C atoms to the DOS in a shell-wise manner (s and p) and store the shell-contributions in files starting with a prefix of pdos.C:

    Analysis {
      ProjectStates {
        Region {
          Atoms = C
          ShellResolved = Yes
          Label = "pdos.C"
        }
      }
    }
    
Running the code

When you run DFTB+, you should always save its output into a file for later inspection. We suggest using a construction like this (output is saved into the file output):

dftb+ | tee output

You will see that DFTB+ optimizes the geometry of graphene by changing the lattice vectors and ion coordinates to locally minimise the total energy. As the starting geometry is quite close to the optimum one, the calculation should finish almost immediately.

Apart from the saved output file (output), you will find several other new files created by the code:

dftb_pin.hsd
Contains the parsed user input with all the default settings for options which have not been explicitely set by the user. You should have look at it if you are unsure whether the defaults DFTB+ used for your calculation are appropriate, or if you want to know which other options you can use to adjust your calculation.
detailed.out
Contains detailed information about the calculated physical quantities (energies, forces, charges, etc.) obtained in the last SCC cycle performed.
band.out
Eigenvalues (in eV) and fillings for each k-point and spin channel.
charges.bin
Charges of the atoms at the last iteration, stored in binary format. You can use this file to restart a calculation with those atomic charges.
geo_end.xyz, geo_end.gen
Final geometry in both XYZ and GEN formats.
pdos.C.1.out, pdos.C.2.out
Output files containing the projected density of states for the first and second angular shells of carbon (in this case the 2s and 2p shells). Their format is similar to band.out.
Analysing results

The very first thing you should check is whether your calculation has converged at all to a relaxed geometry. The last line of the output file contains the appropriate message:

Geometry converged

This means that the program stopped because the forces on the atoms which are allowed to move (all of them in this example) were less than a given tolerance (specified in the option MaxForceComponent, which defaults to 1e-4 atomic units) and not instead because the maximal number of geometry optimization steps have been executed (option MaxSteps, default 200).

You should visualize the resulting structure using Jmol (or any other molecular visualization tool). You should probably repeat the geometry again to get a better idea how it looks like, as we did for the starting structure above. The distance between the C atoms should be very similar to those in the initial structure.

In order to visualize the density of states and the partial density of states, you should convert the corresponding human readable files (with prefix .out) to XY-format data

dp_dos band.out dos.dat
dp_dos -w pdos.C.1.out pdos.C.1.dat
dp_dos -w pdos.C.2.out pdos.C.2.dat

Please note the flag -w, which is mandatory when converting partial density of states data for plotting. You can obtain more information about various flags for dp_dos by issuing:

dp_dos -h

You can visualize the DOS and the PDOS for the s- and p-shells of carbon in one picture using the plotxy tool, which is a simple command line wrapper around the matplotlib python library (issue the command plotxy -h for help):

plotxy --xlabel "Energy [eV]" --ylabel "DOS" dos.dat pdos.C.1.dat pdos.C.2.dat &

You can use also any other program (gnuplot, xmgrace) which can visualize XY-data. You should see something similar to Figure 6.

DOS and PDOS for graphene

DOS and PDOS of graphene

The position of the Fermi level (at -4.67 eV) can be read out from the detailed.out file, either directly or by using an appropriate grep command:

grep "Fermi level" detailed.out

As expected for graphene, the DOS vanishes at the Fermi level. Around the Fermi level, all states are composed of the p-orbitals of the carbons, the s-orbitals only contribute to energeticaly much lower and much higher states. Also, one can observe the van-Hove-singularties. The wiggles at around 0 eV and at higher energy are artifacts. Using more k-points for the Brillouin-zone sampling or using a slightly wider broadening function in dp_dos would smooth them out.

Band structure

[Input: recipes/defect/carbon2d-elect/graphene/bands/]

Band structure calculations in DFTB (as in DFT) always consist of two steps:

  1. Calculating an accurate ground state charge density by using a high quality k-point sampling.
  2. Determining the eigenvalues at the desired k-points of the band structure, using the density obtained in the previous step. The density is not changed during this step of the band structure calculation.

Step 1 you just have executed, so you can copy the final geometry and the data file containing the converged charges from that calculation into your current working directory:

cp ../latopt/geo_end.gen .
cp ../latopt/charge.bin .

Have a look on the dftb_in.hsd file for the band structure calculation. It differs from the previous one only in a few aspects:

  • We use the end geometry of the previous calculation as geometry:

    Geometry = GenFormat {
      <<< "geo_end.gen"
    }
    
  • We need static calculation only (no atoms should be moved), therefore, no driver block has been specified.

  • The k-points are specified along specific high symmetry lines of the Brillouin-zone (K-Gamma-M-K):

    KPointsAndWeights = KLines {
      1    0.33333333  0.66666666 0.0    # K
     20    0.0  0.0  0.0                 # Gamma
     20    0.5  0.0  0.0                 # M
     10    0.33333333  0.66666666 0.0    # K
    }
    
  • We initialize the calculation with the charges stored during the previous run:

    ReadInitialCharges = Yes
    
  • We do not want to change the charges during the calculation, therefore, we set the maximum number of SCC cycles to one:

    MaxSCCIterations = 1
    

Let’s run the code and convert the band structure output to XY-format:

dftb+ | tee output
dp_bands band.out band

The dp_bands tool extracts the band structure from the file band.out and stores it in the file band_tot.dat. For spin polarized systems, the name of the output file would be different. Use:

dp_bands -h

to get help information about the arguments and the possible options for dp_bands.

In order to investigate the band structure we first look up the position of the Fermi level in the previous calculation performed with the accurate k-sampling

grep "Fermi level" ../latopt/detailed.out

which yields -4.67 eV, and then visualize the band structure by invoking

plotxy -L --xlabel "K points" --ylabel "Energy [eV]" band_tot.dat &

This results in the band structure as shown in Figure 7.

Band structure of graphene

Band structure of graphene

You can see the linear dispersion relations around the point K in the Brillouin-zone (k-points 0 and 51 in our circuit) which is a very typical characteristic of graphene.

Zigzag nanoribbon

Next we will study some properties of a hydrogen saturated carbon zigzag nanoribbon.

Calculting the density and DOS

[Input: recipes/defect/carbon2d-elect/zigzag/density/]

The initial geometry for the zigzag nanoribbon contains one chain of the structure, repeated periodically along the z-direction. The lattice vectors orthogonal to the periodicity (along the x- and y- axis) are set to be long enough to avoid any interaction between the repeated images.

First convert the GEN-file to XYZ-format and visualize it:

gen2xyz geo.gen
jmol geo.xyz &

Similar to the case of perfect graphene, you should check first the initial geometry by repeating it along the periodic axis (the third lattice vector in this example) and visualize it. The necessary steps are collected in the file checkgeo.sh. Please have a look at its content to understand what will happen, and then issue

./checkgeo.sh

to obtain the molecule shown in Figure 8.

Band structure of graphene

Section of an H-saturated zigzag nanoribbon

The control file dftb_in.hsd is similar to the previous examples, with a few differences only:

  • We use the 1 x 1 x 24 Monkhorst-Pack k-point set to sample the Brillouin-zone, since the ribbon is only periodic along the direction of the third lattice vector. The two other lattice vectors have been choosen to be long enough to avoid interaction between the artificially repeated ribbons.:

    KPointsAndWeights = SupercellFolding {
      1 0 0
      0 1 0
      0 0 24
      0.0 0.0 0.5
    }
    
  • In order to analyze, which atoms contribute to the states around the Fermi level, we create four projection regions containing the saturating H atoms, the C atoms in the outermost layer of the ribbon, the C atoms in the second outermost layer and finally the C atoms in the third outermost layer, respectively. Since the ribbon is mirror symmetric, we include the corresponding atoms on both sides in each projection region:

    ProjectStates {
    
      # The terminating H atoms on the ribbon edges
      Region {
        Atoms = H
        Label = "pdos.H"
      }
    
      # The surface C atoms
      Region {
        Atoms  = 2 17
        Label = "pdos.C1"
      }
    
      # The next row of C atoms further inside
      Region {
        Atoms = 3 16
        Label = "pdos.C2"
      }
    
      # Some more 'bulk-like' C atoms even deeper
      Region {
        Atoms = 4 15
        Label = "pdos.C3"
      }
    }
    

You can run the program and convert the output files by issuing:

./run.sh

When the program has finished, look up the Fermi level and visualize the DOS and PDOS contributions. The necessary commands are collected in showdos.sh:

./showdos.sh

When you zoom into the area around the Fermi level (-4.57 eV), you should obtain something like Figure 9.

DOS of zigzag nanoribbon

DOS of the zigzag nanoribbon around the Fermi energy

You can see that the structure is clearly metallic (displaying a non-zero density of states at the Fermi energy). The states around the Fermi level are composed of the orbitals of the C atoms in the outermost and the third outermost layer of the ribbon. There is no contribution from the C atom in the layer in between or from the H atoms to the Fermi level.

Band structure

[Input: recipes/defect/carbon2d-elect/zigzag/bands/]

Now let’s calculate the band structure of the zigzag nanoribbon. The commands are in the script run.sh, so just issue:

./run.sh

You will see DFTB+ finishing with an error message

ERROR!
-> SCC is NOT converged, maximal SCC iterations exceeded

Normally, it would mean that DFTB+ did not manage to find a self consistent charge distribution for its last geometry. In our case, however, it is not an error, but the desired behaviour. We have specified in dftb_in.hsd the options

ReadInitialCharges = Yes
MaxSCCIterations = 1

requiring the program to stop after one SCC iteration. The charges are at this point not self consistent with respect to the k-point set used for sampling the band structure calculation. However, k-points along high symmetry lines of the Brillouin-zone, as used to obtain the band structures, usually represent a poor sampling. Therefore a converged density obtained with an accurate k-sampling should be used to obtain the eigenlevels, and no self consistency is needed.

To look up the Fermi level and plot the band structure use the commands in showbands.sh:

./showbands.sh

You should obtain a band structure similar to Figure 10.

Band structure of the zigzag nanoribbon

Band structure of the zigzag nanoribbon

Again, one can see, that there are states around the Fermi-energy, so the nanoribbon is metallic.

Perfect armchair nanoribbon

We now investigate a hydrogen saturated armchair carbon nanoribbon, examining both the perfect ribbon and two defective structures, each with a vacancy at a different position in the ribbon. In order to keep the tutorial short, we will not relax the vacancies, but will only remove one atom from the perfect structure.

Total energy and density of state

[Input: recipes/defect/carbon2d-elect/armchair/perfect_density/]

The steps to calculate the DOS of the perfect H-saturated armchair nanoribbon are the same as for the zigzag case. First check the geometry with the help of repeated supercells:

./checkgeo.sh

You will see a repeated image of the perfect armchair nanoribbon unit cell (Figure 11).

Perfect armchair nanoribbon geometry.

Perfect armchair nanoribbon unit cell

The edge of the ribbon is visually different from the zigzag case. As it turns out, this also has some physical consequences. Let’s calculate the electronic density and extract the density of states:

./run.sh

If you look up the calculated Fermi level and then visualize the DOS

./showdos.sh

you can immediately see ( Figure 12) that there are no states around the Fermi-energy (-4.4 eV), i.e. the investigated armchair nanoribbon is non-metallic.

DOS of the perfect armchair nanoribbon

DOS of the perfect armchair nanoribbon

Band structure

[Input: recipes/defect/carbon2d-elect/armchair/perfect_bands/]

Let’s have a quick look at the band structure of the armchair H-saturated ribbon. The steps are the same as for the zigzag case, so just issue:

./run.sh
./showbands.sh

You should obtain a band structure like in Figure 13. You can read off the position of the band edges, when you zoom into the energy region around the gap: The valence band edge and the conduction band edge are in the Gamma point at -4.7 and -4.2 eV, respectively. You can also easily extract this information from the band.out file, where the occupation goes from nearly 2.0 to nearly 0.0 in the first k-point (the Gamma point).

Band structure of perfect armchair nanoribbon.

The band structure of the perfect hydrogen passivated armchair nanoribbon. The Fermi energy is at -4.4 eV.

Armchair nanoribbon with vacancy

Density and DOS

[Input: recipes/defect/carbon2d-elect/armchair/vacancy1_density/, recipes/defect/carbon2d-elect/armchair/vacancy21_density/]

As next, we should investigate two armchair nanoribbons with a vacancy in each. The inputs can be found in the corresponding directories and you can visualize both with the command

./showgeom_v12.sh

As you can see on Figure 14 and Figure 15, the vacancy is in the two cases on different sublattices.

Armchair nanoribbon geometry with vacancy (structure 1)

Armchair nanoribbon with vacancy (structure 1)

Armchair nanoribbon geometry with vacancy (structure 2)

Armchair nanoribbon with vacancy (structure 2)

The two vacancies (structures 1 and 2) are located on different sublattices. Since the geometries are periodic along the z-direction, the defects are also repeated. As we would like to calculate a single vacancy, we have to make our unit cell for the defect calculation large enough to avoid significant defect-defect interactions. In this case, the defective cells contain twelve unit cells.

In order to calculate the electron density of both vacancies, issue:

./run_v12.sh

This will take slightly longer than the previous calculations, since each system contains more than four hundred atoms.

We want to analyse the density of states of the two different vacancies, together with that of the defect-free system. The commands necessary to extract the DOS of all three configurations and show them in one figure have been stored in the script showdos_perf_v12.sh. Execute it

./showdos_perf_v12.sh

to obtain a figure like Figure 16.

DOS of armchair nanoribbons without and with vacancy.

The DOS of the perfect nanoribbon is indicated by solid blue line, the DOS of the nanoribbons with vacancies with green and red lines, respectively.

As you can see, in contrast to the zigzag nanoribbon, the perfect armchair nanoribbon is insulating as it has no states around the Fermi-energy (-4.45 eV). The structures with vacancies, on the other hand, introduce dangling (unsaturated) bonds, leading to unoccupied states around the Fermi-energy. We can also see, that the defects affect the band edges, which are shifted with respect to their position in the perfect structure. It also seems that the valence band edge is more affected than the conduction band edge, and in the case of vacancy 2 (red line) the effect is significantly larger than for vacancy 1 (green line).

Vacancy formation energy

You should also be able to calculate the formation energies of the two vacancies. The formation energy \(E_{\text{form}}\) of the vacancy in our case can be calculated as

\[E_{\text{form}} = \left( E_{\text{vac}} + E_{\text{C}} \right) - 12 \times E_{\text{perf}}\]

where \(E_{\text{vac}}\) is the total energy of the nanoribbon with the vacancy present, \(E_{\text{C}}\) is the energy of a C atom in its standard phase and \(E_{\text{perf}}\) is the energy of the perfect nanoribbon. Since the defective nanoribbons contain 12 unit cells of the perfect one, the energy of the perfect ribbon unit cell has to be multiplied by twelve. As a standard phase of carbon, we will take perfect graphene for simplicity. The energy of the C atom in its standard phase is then obtained by dividing the total energy of the perfect graphene primitive unit cell by two. (Look up this energy from detailed.out in the directory elect/graphene/density.) By calculating the appropriate quantities you should obtain ~8.5 eV for the formation energy of both vacancies. This is quite a high value, but you should recall that the vacancies have not been structurally optimised, and their formation energies are therefore, significantly higher than for the relaxed configurations.

Defect levels

[Input: recipes/defect/carbon2d-elect/armchair/vacancy2_wf/]

Finally we should identify the localised defect levels for vacancy 2 and plot the corresponding one-electron wavefunctions.

The vacancy was created by removing one C atom, which had three first neighbors. Therefore, three \(\text{sp}^2\) type dangling bonds remain in the lattice, which will then form some linear combinations to produce three defect levels, which may or may not be in the band gap. The DOS you have plotted before, indicates there are indeed defect levels in the gap, but due to the smearing it is hard to say how many they are.

We want to investigate the defect levels at the Gamma point, as this is where the perfect nanoribbon has its band edges. We will therefore do a quick Gamma-point only calculation for vacancy structure 2 using the density we obtained before. We will set up the input to write out also the eigenvectors (and some additional information) so that we can plot the defect levels with waveplot later. This needs the following additional settings in dftb_in.hsd:

Options {
  WriteEigenvectors = Yes
  WriteDetailedXML = Yes
  WriteDetailedOut = No
}

To just run the calculation

./run.sh

and open the band.out file. You will see, that you have three levels (levels 742, 743 and 744 at energies of -4.51, -4.45 and -4.45 eV, respectively) which are between the energies of the band edge states of the perfect ribbon. We will visualize those three levels by using the waveplot tool.

Waveplot reads the eigenvectors produced by DFTB+ and plots real space wavefunctions and densities. The input file waveplot_in.hsd can be used to control which levels and which region waveplot should visualize, and on what kind of grid. In the current example, we will project the real part of the wavefunctions for the levels 742, 743 and 744. In order to run Waveplot, enter:

waveplot | tee output.waveplot

The calculation could again take a few minutes. At the end, you should see three files with the .cube prefix, containing the volumetric information for the three selected one-electron wavefunctions.

We will use Jmol to visualize the various wavefunction components. Unfortunately, the visualization of iso-surfaces in Jmol needs some scripting. You can find the necessary commands in the files show*.js. You can either type in these commands in the Jmol console (which should be opened via the menu File | Console…) or pass it to Jmol using the -s option at start-up. For the case latter you will find prepared commands to visualize the various orbitals in the files

./showdeflev1.sh
./showdeflev2.sh
./showdeflev3.sh

Looking at the defect levels, you can see that the defect level lowest in energy (742) has a significant contribution on the atoms around the defect, but also a non-negligible delocalized part smeared over almost all atoms in the system. Apparently a localized defect level has hybridized with the delocalized valence band edge state, resulting in a mixture between localized and non-localized state. The other two defect levels, on the other hand, have wavefunctions which are well localized on the atoms around the vacancy site. Note that in accordance with the overall symmetry of the system, the defect levels are either symmetric or antisymmetric with respect to the mirror plane in the middle of the ribbon.

Wavefunction of the lowest defect level

Wavefunction of the lowest defect level of the hydrogen saturated armchair nanoribbon with a vacancy. Blue and red surfaces indicate isosurfaces at +0.02 and -0.02 atomic units, respectively.

Wavefunction of the second defect level

Wavefunction of the second lowest defect level of the hydrogen saturated armchair nanoribbon with a vacancy. Blue and red surfaces indicate isosurfaces at +0.02 and -0.02 atomic units, respectively.

Wavefunction of the lowest defect level

Wavefunction of the highest defect level of the hydrogen saturated armchair nanoribbon with a vacancy. Blue and red surfaces indicate isosurfaces at +0.02 and -0.02 atomic units, respectively.

Electronic dynamics

This chapter discusses DFTB+ calculations where electrons are allowed to move (evolve in time). This allows the calculation, for example, of electronic absorption spectra if the electrons are initially perturbed from the ground-state by a Dirac delta pulse. More general external time dependent fields can also be applied, including arbitrary elliptical polarisation and pulses with various envelope functions.

Ehrenfest dynamics can also be performed, where both the electrons and ions move. In this methods nuclei are driven by the instantaneous expectation value of the force at each time for the moving electron density.

In this chapter we will provide example calculations of electronic spectra, driving of the electronic dynamics using external pulsed and continuous fields and driving of nuclear motion via electronic excitation.

Electronic dynamics

Electron nuclear dynamics within the Ehrenfest ansatz is based on the integration of the following equation of motion (EOM) for the reduced one body density matrix:

()\[\dot{\rho} = \underbrace{-\mathrm{i} (S^{-1} H \rho - \rho H S^{-1})}_\text{electronic terms} - \underbrace{(S^{-1} D \rho + \rho D^\dagger S^{-1})}_\text{non-adiabatic terms}\]

As noted above the EOM contains a purely electronic driving term in which the dynamics may be generated, either from an external time dependent driving field in \(H\) or from the fact that the density matrix might be in a non-equilibrium state and therefore does nos commute with \(H\).

The second term in equation (1) is part of the Ehrenfest ansatz and includes the non adiabatic coupling matrix \(D\), the elements of which are defined as \(D_{\kappa \nu} = \langle \phi_\kappa | \dot{\phi_\nu} \rangle\), where \(| \phi_\kappa \rangle\) and \(| \phi_\nu \rangle\) are a pair of atomic orbitals and, in a localised basis set such as the one used in DFTB+, the time derivative of an atomic orbital with respect to time is a function of the velocity of the nuclei to which it is attached. The non-adiabatic term introduces a mechanism by which nuclear motion can induce electronic transitions.

The forces used to propagate the nuclei are calculated from the instantaneous expectation value of the force operator adding the corresponding self consistent and repulsive terms. This force comes from a non equilibrium density matrix and is the form in which an excited density matrix may induce nuclear motion.

The EOM for the density matrix is integrated using a Leapfrog scheme in which the density matrix at time \(t_{i+i}\) is obtained from its value a t time \(t_{i-1}\) and its derivative at time \(t_i\):

\(\rho_{i+1}=\rho_{i-1}+2\Delta t \dot{\rho}_i.\)

The nuclear motion is integrated using a velocity Verlet algorithm.

The propagation of the electronic and nuclear dynamics may be used for the calculation of absorption spectra, to study the response to constant or pulsed excitation and even the simulation of pump-probe spectroscopy.

All input related to electronic dynamics is located within the ElectronDynamics block of the input file. Electron dynamics is run after ground state self consistency is achieved, with the possibility of reading previously converged charges for the ground state or the state of the time-evolving density matrix to continue time propagation.

Calculation of electronic absorption spectra

This section discusses the calculation of electronic absorption spectra using the real time propagation of electronic dynamics as implemented in DFTB+.

Unless some good reason exists for not doing so, the electronic spectrum should be calculated at the equilibrium geometry. For this example we will use an optimised chlorophyll a molecule. This example reproduces the results in Oviedo, M. B., Negre, C. F. A., & Sánchez, C. G. (2010). Dynamical simulation of the optical response of photosynthetic pigments. Physical Chemistry Chemical Physics : PCCP, 12(25), 6706–6711.

The input

[Input: recipes/electronicdynamics/spectrum/]

The following input can be used to calculate the absorption spectrum of chlorophyll a:

Geometry = GenFormat {
  <<< "coords.gen"
}

Hamiltonian = DFTB {
  SCC = Yes
  SCCTolerance = 1.0e-7
  MaxAngularMomentum = {
    Mg = "p"
    C = "p"
    N = "p"
    O = "p"
    H = "s"
  }
  Filling = Fermi {
    Temperature [K] = 300
  }
}

ElectronDynamics = {
   Steps = 20000
   TimeStep [au] = 0.2
   Perturbation = Kick {
     PolarizationDirection = all
   }
   FieldStrength [v/a] = 0.001
}

The optimised geometry is located in the coords.gen file. Note that for this example the long phytol chain present in the natural molecule has been replaced by a terminating hydrogen atom since it does not have a significant influence on the absorption spectrum.

For the calculation of absorption spectra, an initial kick of the system is made using a Dirac delta type perturbation. The input specifies that after the initial perturbation of Kick type, twenty thousand steps of dynamics will be executed using a time step of 0.2 atomic units. The Kick perturbation can be applied in any of the Cartesian directions (x, y or z). The use of all here in the input instructs the code to run three independent dynamic calculations, one with an initial Kick in each Cartesian direction.

After self consistency has been achieved and the ground state density matrix is obtained, the perturbation is applied and then the propagation starts, the output produced is the following:

S inverted
Density kicked along x!
Starting dynamics
Step        0  elapsed loop time:   0.012400  average time per loop   0.012400
Step     2000  elapsed loop time:  19.112000  average time per loop   0.009551
Step     4000  elapsed loop time:  35.407101  average time per loop   0.008850
Step     6000  elapsed loop time:  52.179100  average time per loop   0.008695
Step     8000  elapsed loop time:  68.688004  average time per loop   0.008585
Step    10000  elapsed loop time:  90.615501  average time per loop   0.009061
Step    12000  elapsed loop time: 109.174500  average time per loop   0.009097
Step    14000  elapsed loop time: 127.921097  average time per loop   0.009137
Step    16000  elapsed loop time: 147.406097  average time per loop   0.009212
Step    18000  elapsed loop time: 167.002502  average time per loop   0.009277
Step    20000  elapsed loop time: 185.372406  average time per loop   0.009268
Dynamics finished OK!
S inverted
Density kicked along y!
Starting dynamics
Step        0  elapsed loop time:   0.023700  average time per loop   0.023700
Step     2000  elapsed loop time:  28.003799  average time per loop   0.013995
Step     4000  elapsed loop time:  52.257900  average time per loop   0.013061
Step     6000  elapsed loop time:  74.137497  average time per loop   0.012354
Step     8000  elapsed loop time:  93.527603  average time per loop   0.011689
Step    10000  elapsed loop time: 115.045998  average time per loop   0.011503
Step    12000  elapsed loop time: 134.955200  average time per loop   0.011245
Step    14000  elapsed loop time: 155.862000  average time per loop   0.011132
Step    16000  elapsed loop time: 176.434799  average time per loop   0.011026
Step    18000  elapsed loop time: 197.430695  average time per loop   0.010968
Step    20000  elapsed loop time: 217.860703  average time per loop   0.010892
Dynamics finished OK!
S inverted
Density kicked along z!
Starting dynamics
Step        0  elapsed loop time:   0.012100  average time per loop   0.012100
Step     2000  elapsed loop time:  27.119101  average time per loop   0.013553
Step     4000  elapsed loop time:  48.640301  average time per loop   0.012157
Step     6000  elapsed loop time:  67.843803  average time per loop   0.011305
Step     8000  elapsed loop time:  87.514702  average time per loop   0.010938
Step    10000  elapsed loop time: 111.822601  average time per loop   0.011181
Step    12000  elapsed loop time: 133.397202  average time per loop   0.011116
Step    14000  elapsed loop time: 153.044098  average time per loop   0.010931
Step    16000  elapsed loop time: 176.008301  average time per loop   0.011000
Step    18000  elapsed loop time: 195.700104  average time per loop   0.010872
Step    20000  elapsed loop time: 216.208694  average time per loop   0.010810
Dynamics finished OK!

The resulting time dependent dipole moment along each Cartesian direction produced the kicks are stored in the mu*.dat output files.

The calculation of the spectrum makes use of the fact that the Fourier transform of induced dipole moment of the molecule in the presence of an external time dependent field (within the linear response range) is related to the Fourier transform of said field in the following manner:

\(\mathbf{mu}(\omega)=\overset\leftrightarrow{\alpha}(\omega)\mathbf{E}(\omega)\)

since the Fourier transform of a Dirac delta is a constant at all frequencies, the polarizability tensor \(\overset\leftrightarrow{\alpha}(\omega)\) can be obtained from the time dependent response. The absorption is proportional to the imaginary part of the trace of the polarizability tensor.

The calculation of the absorption spectrum is carried out using the script calc_timeprop_spectrum either available after make install of DFTB+, or located in the tools/misc directory under the dftbplus source tree. The invocation of the script is as follows:

calc_timeprop_spectrum -d 20.0 -f 0.001

The exciting field strength is specified with the -f flag, the -d flag specifies a damping constant used to exponentially damp the dipole signal to zero within the simulation time. This damping time is expressed in femtoseconds. The effect of damping the dipole moment is to add a uniform width to every spectral line and is necessary to smooth out any ringing in the spectrum peaks after the transform. In essence this damping procedure is equivalent to using a windowing function.

The spectrum is located in the output files spec-ev and spec-nm. In this case the spectrum looks as follows:

Absorption spectrum of chlorophyll a.

The band between 400 and 500 nm is called the Soret band and the one between 600 and 700 nm is the Q band. This band is the band that provides is responsible for the photo-biologic activity of chlorophylls as antennae capable of capturing solar energy in the primary process of photosynthesis.

Driving electronic dynamics with external fields

[Input: recipes/electronicdynamics/driving/]

In this section we will describe how to drive the Q band of chlorophyll a using both a continuous and pulsed laser. The first step is to find the transition dipole moment direction of the Q band. This is done using the calc_timeprop_maxpoldir script either available after make install of DFTB+, of located in the tools/misc directory under the dftbplus source tree. The invocation of the script is as follows:

calc_timeprop_maxpoldir -d 20.0 -w 636.0

which produces the output:

PolarizationDirection = -0.08808129 0.99564018 -0.03069709

these are the three Cartesian components of the transition dipole moment vector. This vector points in the direction in which a driving laser, if it is in resonance with the excitation, will have maximum absorption. This vector is the eigenvector of the polarizability tensor at that energy that has the largest eigenvalue. It is equivalent to a principal axis of inertia in rigid body rotation.

Using the ElectronDynamics block that follows:

ElectronDynamics = {
  Steps = 60000
  TimeStep [au] = 0.2
  Perturbation = Laser {
    PolarizationDirection = -0.08808129 0.99564018 -0.03069709
    LaserEnergy [eV] = 1.94944
  }
  FieldStrength [V/A] = 0.0001
}

We resonantly excite the Q band along the direction of its maximum polarizability. The obtained magnitude of the dipole moment as a function of time is shown in the following figure:

Dipole moment as a function of time.

An initial transient, the dipole moment maxima and minima grow in absolute value as a linear function of time after, confirming that the applied field is in resonance with the excitation and within the linear response regime. The slope of this growth is related to the transition dipole of the excitation.

Off-resonant excitation

[Input: recipes/electronicdynamics/driving-oot/]

An off resonance excitation at 1.9 eV produces the following result for the dipole moment:

Dipole moment as a function of time.

showing characteristic beats, the frequency of which are related to the amount of detuning. The amplitude of the dipole moment change caused by the illumination is also much smaller that when the laser is in tune with the excitation.

Ehrenfest dynamics

As explained above (in the introduction to the Electronic dynamics chapter) electron-nuclear dynamics within the Ehrenfest ansatz are based on the integration of the ionic equations of motion using the expectation value of the force that corresponds to the current density matrix. Nuclear dynamics drive electronic dynamics in turn through the time-dependent nuclear positions that appear in the Hamiltonian. The nuclear and electronic systems interact through expectation values and therefore Ehrenfest is a mean-field approximation to the actual coupled electron-ion dynamics. This approximation is useful in some situations; however, the neglected electron-ion correlations might be essential for some processes. The crossing of conical intersections, for example, is where the approximation breaks down drastically.

Benzene example

[Input: recipes/electronicdynamics/ehrenfest/]

As an example of Ehrenfest dynamics, starting from the equilibrium geometry we can excite the lowest-lying \(\pi-\pi^*\) excitation of a benzene molecule using a short laser pulse, and explicitly allow the ions to move. In this case, the ElectronDynamics blocks is as follows:

ElectronDynamics = {
  Steps = 100000
  TimeStep [au] = 0.1
  Perturbation = Laser {
    PolarizationDirection = 0.00000001 0.61419463 -0.78915459
    LaserEnergy [eV] = 6.834 # pi-pi* transition energy
  }
  EnvelopeShape = Sin2 {
    Time1 [fs] = 10.0 # end-time of the pulse
  }
  FieldStrength [V/A] = 0.10
  IonDynamics = Yes
  InitialTemperature [K] = 0.0
  Populations = Yes
}

The keyword IonDynamics is set to Yes and for this example, we are starting with zero initial atomic velocities. The short, but strong, laser pulse drives electrons from the HOMO to the LUMO and this sudden change in the electronic structure impulsively drives the nuclei to move. In the following figure, we plot the distances between neighbouring carbon atoms in the benzene ring:

Carbon-carbon distances for benzene following excitation by a laser pulse.

After the short excitation, all carbon atoms move in a breathing motion with all distances increasing and decreasing periodically. The breathing is not perfect but is a good approximation of the actual motion. The process can be explained if we picture the state of the system being in a coherent superposition between the ground and excited states. In this superposition, electrons which were occupying bonding orbitals now partially populate anti-bonding orbitals. In this highly symmetric molecule, all C-C bonds lose strength and therefore, after the excitation, the equilibrium position is displaced to a longer inter-atomic distance. This sudden change of effective potential energy surface drives the nuclear motion, which now oscillates around the new displaced equilibrium.

Parallel usage of DFTB+

Introduction

Why run in parallel?

There are two main reasons for using parallel computers:

  1. Faster throughput of results
  2. Ability to simulate larger systems

These are related to two concepts in parallel computing, strong and weak parallel scaling of software. Strong scaling is defined as the how the time to solve a problem varies with the number of computing processors for a fixed total problem size. Weak scaling is how the time varies when the problem size increases with the number of processors.

There can also be restrictions on the amount of available memory, and one solution is to use a parallel machine where the amount of available memory increases with the number of computing cores (usually a distributed memory machine, but shared memory systems often also have a substantial amount of memory per computing core).

Types of parallel hardware

The two types of parallel computer that DFTB+ currently can make use of are either

  1. shared memory machines, where all processes have access to the same data. These are typically small to medium size systems. Most modern CPUs have multiple cores, so fall into this category even for desktop machines.
  2. distributed memory which consists of network connected machines. These are typically larger scale systems with higher numbers of processors and a dedicated high speed network between them.

The different system types require distinct program models to make use of the hardware (however code designed for a distributed memory system can often be also used for shared memory architectures).

Shared memory parallel

The default compilation options for DFTB+ produce an OpenMP enabled parallel program for shared memory systems. The make.arch file for compiling the code should

  1. Include the necessary compiler and linking options for OpenMP. The supported make.arch examples already do this.
  2. A thread parallel LAPACK and BLAS is required and should be specified in make.arch, along with any extra thread communication libraries. Most modern implementations of LAPACK (MKL, openBLAS, ATLAS BLAS, etc.) support shared memory parallelism.

Distributed memory parallel

DFTB+ can also be compiled to use MPI parallelism, typically for distributed memory machines. This is usually required for larger parallel computers and problem sizes.

This requires additional computational and communication libraries to be available on your system. The system administrator(s) of your machine may be able to help locate or configure these for you. The required packages are

  • MPI : openMPI and MPICH are common options, but there may be a vendor supplied library for your network that has better performance
  • ScaLAPACK
  • LAPACK and BLAS : optimised serial implementations

Sections of the code are currently unable to operate with MPI parallelism (particularly the excited state calculations), but the majority of the functionality is the same as the shared memory version.

Hybrid parallelism

DFTB+ can be compiled with both MPI and openMP parallelism combined. However using this can require system specific settings for thread affinity to provide efficiency and this is beyond the scope of this tutorial.

Amdahl’s law

One of the major issues in the performance of a parallel program is the possible speed up from using parallelism. Amdahl’s law describes the expected gains for operations which are partially parallelised, i.e., where there are computational parts that must be performed in serial as well others which operate in parallel. If the fraction of the operations performed in parallel are p, then compared to serial operation, the expected speed-up is:

\[S(N) = \frac{ 1 }{ 1 - p + \frac{p}{N} }.\]

Amdahl’s law can provide a reasonable indication of the performance of a problem on increasing numbers of processors, depending on the parallelism of the task:

Amdahl's law for speed-up.

As you can see, only substantially parallel tasks give substantial speed up on larger numbers of processors. The limiting speed up for large numbers of processors is

\[\frac{1}{1 - p},\]

i.e., the inverse of the serial fraction of the code, which gains nothing from parallelism.

Amdahl’s does not include the effects of latency or the cache hierarchy on parallel performance, but can be a good guide to the limits of scalability for fixed sizes of problem on parallel machines.

Gustafson’s law

Amdahl’s law assumes that the size of the computational problem remains fixed as the number of processes increases. However, it is often more common that the problem is increased along with the number of processes.

This situation is partly described by Gustafson’s law. However in the case of DFTB the computational effort for conventional diagonalisation (the dominant part for most calculations) grows as the cube of the number of atoms. This complicates the situation. In principle idealised weak scaling where the number of atoms increases proportional to the number of processors would give a time to solution that grows as the square of the number of atoms.

See for example Snyder, Annu. Rev. Comput. Sci. 1: 289, 1986 for a discussion of these issues.

Compiling the code

Compiling for OpenMP

The default make.arch examples for DFTB+ should enable OpenMP parallelism, but in order to gain from this you will also require an efficient thread parallelised BLAS library. These include MKL or OpenBLAS.

The available dftb+ binaries are compiled with OpenMP support and use the OpenBLAS libraries.

You will also usually need to specify the number of parallel threads that the code can use. This can be done by setting the OMP_NUM_THREADS environment variable

  • For the bash shell:

    export OMP_NUM_THREADS=<number of threads to use>
    
  • For the csh or tcsh shells:

    setenv OMP_NUM_THREADS <number of threads to use>
    

before running DFTB+.

Compiling with MPI

In order to use Message Passing Interface (MPI) enabled DFTB+, you will require

  1. A working MPI 2 (or later) installation. This should include a wrapper for Fortran compilation, which is named some variant on mpifort or mpif90 for most MPI implementations.
  2. The ScaLAPACK library (or a library such as MKL that supplies this functionality).
  3. Either serial or thread aware LAPACK and BLAS libraries (or equivalents).

Depending on your system, optimised MPI libraries for your network may be available (contact your system manager for details).

Obtaining the source

If you have downloaded the official 18.1 (or later) release from the DFTB+ website, all essential components for compiling the code in parallel with MPI support are included (excluding the prerequisites listed above).

If instead you obtain the code from the github repository, the mpifx and scalapackfx libraries will be required. These are included as submodules from the main code, and can be fetched with:

git submodule update --init --recursive

There is more information on the structure and use of submodules online.

Building DFTB+ with MPI support enabled

You will need to either edit the file make.config to enable:

WITH_MPI := 1

or compile with

make WITH_MPI=1

to produce an MPI enabled binary. In this case, we suggest serial LAPACK and BLAS libraries are used (since the main parallelism comes from ScaLAPACK).

Benchmarking and scalability

DFTB+ has an internal timer for various significant parts of its calculations. This can be enabled by adding the following option to the input:

Options {
  TimingVerbosity = 2
}

This will activate the timers for the most significant stages of a calculation. Higher values increase the verbosity of timing, while a value of -1 activates all available timers.

The typical output of running the code with timers enabled for a serial calculation will end with lines that look something like

--------------------------------------------------------------------------------
DFTB+ running times                          cpu [s]             wall clock [s]
--------------------------------------------------------------------------------
Pre-SCC initialisation                 +     3.46  (  5.8%)      3.45  (  5.8%)
SCC                                    +    47.34  ( 79.6%)     47.34  ( 79.6%)
  Diagonalisation                           44.79  ( 75.3%)     44.79  ( 75.3%)
  Density matrix creation                    2.21  (  3.7%)      2.21  (  3.7%)
Post-SCC processing                    +     8.68  ( 14.6%)      8.68  ( 14.6%)
  Energy-density matrix creation             0.55  (  0.9%)      0.55  (  0.9%)
  Force calculation                          7.74  ( 13.0%)      7.74  ( 13.0%)
  Stress calculation                         0.94  (  1.6%)      0.94  (  1.6%)
--------------------------------------------------------------------------------
Missing                                +     0.00  (  0.0%)      0.00  (  0.0%)
Total                                  =    59.48  (100.0%)     59.47  (100.0%)
--------------------------------------------------------------------------------

This shows the computing time (cpu) and real time (wall clock) times of major parts of the calculation.

An alternative method to obtain total times is by using the built-in shell command time when running the DFTB+ binary

time dftb+ > output

In the above example, at the termination of the code timings will be printed:

real    0m59.518s
user    0m59.430s
sys     0m0.092s

More advanced timing is possible by using profiling tools such as gprof.

Examples

Shared memory parallelism

The strong scaling for the OpenMP parallel code for some example inputs is shown here

Strong scaling for some SiC supercells.

These results were calculated on an old Xeon workstation (E5607 @ 2.27GHz) and are a self-consistent calculation of forces for different sized SiC supercells with 1 k-point and s and p shells of orbitals on the atoms. Timings and speed-ups come from data produced from the code’s reported total wall clock time.

There are several points to note

  1. The parallel scalability improves for the larger problems, going from ~68% to ~79% on moving from 512 atoms to 1728. This is a common feature, that larger problems give sufficient material for parallelism to work better.
  2. The gain in throughput for these particular problems is around a factor of 2 when using 4 processors, and raises to around 3 on 8 processors for the largest problem.
  3. From Amdhal’s law we can estimate the saturating limits for large numbers of processors as ~3.1 and ~4.8 for the smallest and largest problems respectively. This implies that there is not much value in using more than ~4 processors for the smallest calculation, since this has already gained around 2/3 of its theoretical maximum speed up. Adding a 5th or 6th processor will only improve performance by ~5% each, so is probably a waste of resources. Similarly for the largest calculation in this example, 6 processors gives around a factor of 3 speed up compared to serial operation, but adding 7th processors will only speed the calculation up by ~6%.
  4. The experimental data does not align exactly with the Amdahl curves, this could be due to competing processes taking resources (this is a shared memory machine with other jobs running) or the problem may run anomalously well for a particular number of processes. In this example 4 processors consistently ran slightly better, perhaps due to the cache sizes on this machine allowing the problem to be stored higher in the memory hierarchy for 4 processors compared to 3 (thus saving some page fetching).
  5. The weak scaling (increasing the number of processors proportional to the number of atoms) shows an approximately \(O(N^2)\) growth in time. A serial solution of these problems would increase as \(O(N^3)\) in the number of atoms.
  6. These timings are for this specific hardware and these particular problems, so you should test the case you are interested in before deciding on a suitable choice of parallel resources.

Weak scaling from the same data set is shown here

Weak scaling for some SiC supercells.
Distributed memory parallelism

This section coming soon!

Topics to be discussed:

  • Parallel scaling of simple examples.
  • Parallel efficiency.
  • Use of the Groups keyword in DFTB+ to improve scaling and efficiency for calculations with spin polarisation and/or k-points.
  • Effects of latency on code performance.

Property calculations

This chapter discusses some of the property calculations that are possible with DFTB+, in some cases via the use of external packages and tools.

Phonon calculations with phonopy

The phonopy code can calculate a range of harmonic and quasi-harmonic vibrational properties and from version 2.0 onwards supports DFTB+. Information about how to install phonopy is available.

The below examples were tested with phonopy v2.1.2.

Phonon band structure

The diamond lattice has very high symmetry, hence a phonon band structure can be obtained with a single calculation. A conventional unit cell with relaxed lattice constant is provided in geo.gen (the phonopy input assumes this is the name of the suplied starting geometry).

The DFTB+ input needs to calculate the atomic forces and also to write a results tag file, hence the dftb_in.hsd input contains:

Analysis = {
  # required option for phonopy
  CalculateForces = Yes
}
Options = {
  # Required options for storing data needed by phonopy
  WriteResultsTag = Yes
}

The (single) distorted geometry and the required phonopy_disp.yaml file is then generated with the command

phonopy -d --dim="4 4 4" --dftb+

This constructs a 4x4x4 supercell of the primitive cell and saves the undistorted and distorted supercells as geo.genS and geo.genS-001 respectively. Note that you should test the required number of repeats in supercell required to reach convergence of the band structure (or other properties of interest).

For the single (in this case) relevant geo.genS-* file, calculate the DFTB forces on the atoms, retaining the resulting results.tag file. This example uses the mio Slater-Koster paramters for carbon.

Then create the required FORCE_SETS file

phonopy -f results.tag --dftb+  ...

This assumes that the results.tag and phonopy_disp.yaml files are in the same directory (for more complex cases the calcuation should be run in the same directory as the phonopy files and the path to directories containing the DFTB+ output files).

Then specify the path in the Brillouin zone you are interested in (see the phonopy documentation). Then post-process the phonopy data, providing the dimensions of the the supercell repeat either on the command line or in the settings file (a DIM file):

phonopy -p band.conf --dim="4 4 4" --dftb+

Finally create a band structure in gnuplot format

phonopy-bandplot --gnuplot band.yaml > band.dat

The resulting band structure for the mio carbon model is shown in Figure 20.

Resulting phonon band structure for diamond.

The resulting phonon band structure for diamond.

Reaction barriers

DFTB+ does not internally contain routines for following reaction barriers.

See the Nudged Elastic Band - NEB section for the use of an external tool to calculate this.

REKS

This chapter discusses REKS (spin-Restricted Ensemble Kohn-Sham) calculations for ground and low-lying excited states. REKS can treat states with strong static correlation (J. Phys. Chem. A 104, 6628) and also produces the correct spin symmetry in cases where unrestricted methods suffer from spin contamination. REKS also allows the calculation of, for example, states with multi-reference character as well as the vertical excitation energies between states. REKS can be classified as Single-state REKS, SA-REKS and SI-SA-REKS.

In this chapter we will

  • provide some example calculations of basic energy and gradient evaluation,
  • discuss additional functionality such as level shifting and spin tuning,
  • demonstrate advanced calculations including non-adiabatic couplings between the states and relaxed densities which can be used for QM/MM calculations.

Finally, we will discuss how to diagnose and handle some the problems which can specifically arise in REKS calculations.

Energy and gradient calculations with REKS

Single-state REKS

[Input: recipes/reks/single_state_reks/]

In single-state REKS, the ground state is calculated including multi-reference character. For example, a ethylene molecule with a torsional angle of 90 degrees has degenerate \({\pi}\) and \({\pi}^*\) orbitals, corresponding to the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) respectively. In this case a standard SCC-DFTB calculation can show oscillating phenomena when trying to obtain self-consistent charges and is missing the static electronic correlations required for a correct variational energy. Similarly, REKS can deal with bond-breaking, di-radicals and magnetic couplings which are difficult in standard calculations.

To deal with the static electronic correlation, REKS first determines the size of the space of active orbitals. Only two electrons in two frontier orbitals, (2,2), is supported in current version of DFTB+.

Note

From the active space, REKS automatically construct possible electronic configurations, called microstate (See the paper JCTC, 2019, 15, 3021-3032.). A total of 6 microstates are generated in a REKS(2,2) calculations, for example the 5th and 6th microstates express high-spin configurations in the (2,2) space. To correctly describe the spin polarised configurations in the space the SpinConstants block is required in the Hamiltonian. For example, there are more up-electrons in the 5th microstate than down-electrons, thus atomic spin constants are needed to describe spin contributions. Note that the user should not include the SpinPolarisation block, only the SpinConstants block is needed.

With this active space, single-state REKS can be carried out using the following input:

Hamiltonian = DFTB {
  SCC = Yes
  SpinConstants = {
    ShellResolvedSpin = Yes
    H = { -0.072 }
    C = { -0.031 -0.025 -0.025 -0.023 }
  }
  ...
}
REKS = SSR22 {
  Energy = {
    Functional = { "PPS" }
  }
  TargetState = 1
  FonMaxIter = 100
  Shift = 1.0
  VerbosityLevel = 1
}

Note that the REKS = SSR22 block is a separate environment in the calculation, so is not included in the Hamiltonian block. In a REKS calculation, the Hamiltonian only controls detailed Hamiltonian settings such as use of a range-separated hybrid functional or dispersion corrections. Three singlet states can be generated from a (2,2) active space. These are the perfectly spin-paired singlet state (PPS), an open-shell singlet state (OSS) and the doubly excited singlet state (DES). Combining these singlet states, we can find the many-particle singlet state which minimises the energy functional. In the single-state REKS case we treat only one singlet state of PPS symmetry, thus this is the only possibility that can be selected in the Functional block of the Energy in this case.

The energy for the PPS state is expressed as

\[E_{PPS} = \sum_L C_L^{PPS} E_L[\rho_L]\]

where \(C_L^{PPS}\) are the weighting factors of each microstate in the resulting PPS state and \(E_L\) are their energies. Here the weighting factors are calculated using the fractional occupation numbers (FONs) of frontier orbitals, thus the energy of the resulting PPS state is minimised with respect to both the Kohn-Sham orbitals and the FONs in single-state REKS.

[Input: recipes/reks/single_state_reks/dftb_in.hsd-0deg]

With this inputs, the standard output shows information about the energy of the PPS state and the FONs contributing to it. The minimised geometry for ethylene is a planar structure and the resulting energy and FONs are given in the standard output:

--------------------------------------------------
  Final REKS(2,2) energy:      -4.89357215

 State     Energy      FON(1)    FON(2)   Spin
  PPS   -4.89357215   1.999990  0.000010  0.00
--------------------------------------------------

The energy of PPS state is -4.8936 Hartree and the FONs of frontier orbitals are ~2.0 and ~0.0, respectively. The orbital character for HOMO is \(\pi\) and the character for LUMO is \(\pi^*\), so the two electrons are occupied in the HOMO.

[Input: recipes/reks/single_state_reks/dftb_in.hsd-90deg]

When the torsional angle begins to rotate to 90 degree, the molecular orbitals change from \(\pi\) and \(\pi^*\) orbitals into two localised single \(\pi\) orbitals. This means the frontier orbitals are now almost degenerate, and this is well described with REKS(2,2) calculations. The resulting energy and FONs are:

--------------------------------------------------
  Final REKS(2,2) energy:      -4.77774313

 State     Energy      FON(1)    FON(2)   Spin
  PPS   -4.77774313   1.000004  0.999996  0.00
--------------------------------------------------

Now, the energy of PPS state becomes -4.7777 Hartree and the FONs become ~1.0 and ~1.0, respectively. In addition, the user can check the energy contributions to the PPS state from detailed.out file. The energy from the spin contribution is ~0.0 Hartree for the planar structure, while the energy becomes -0.0188 Hartree for the 90 degree rotated structure. Overall, we can conclude that the planar structure almost consists of only the first microstate, while the rotated structure consists of several microstates.

SA-REKS

[Input: recipes/reks/sa_reks/]

Single-state REKS can treat only the ground state, thus the vertical excitation energy cannot be calculated with this method. From the restricted open-shell Kohn-Sham scheme, we can construct the OSS state, which is expressed as

\[E_{OSS} = \sum_L C_L^{OSS} E_L[\rho_L]\]

where \(C_L^{OSS}\) is weighting factors of each microstate making up the OSS state. By minimising the energy for the ensemble of PPS and OSS states, we can calculate the vertical excitation energy between them with state-average REKS (SA-REKS). Again we calculate the energy of the PPS and OSS states of the two geometries of an ethylene molecule with SA-REKS. The REKS block has now an additional Gradient block to calculate the gradient and optimise for the target state:

REKS = SSR22 {
  Energy = {
    Functional = { "PPS" "OSS" }
  }
  TargetState = 1
  FonMaxIter = 100
  Shift = 1.0
  Gradient = ConjugateGradient {
    CGmaxIter = 100
    Tolerance = 1.0E-8
    Preconditioner = Yes
    SaveMemory = Yes
  }
  VerbosityLevel = 1
}

The user also now has to include the OSS state in addition to the PPS in the Functional block so that the energy of both are now calculated. The resulting energy and additional information is given by:

--------------------------------------------------
 Final SA-REKS(2,2) energy:      -4.78921495

 State     Energy      FON(1)    FON(2)   Spin
  PPS   -4.89357215   1.999990  0.000010  0.00
  OSS   -4.68485776   1.000000  1.000000  0.00
 Trip   -4.73085776   1.000000  1.000000  1.00
--------------------------------------------------

 Lagrangian Wab:   0.00000000  0.00000000

--------------------------------------------------
 SSR: 2SI-2SA-REKS(2,2) Hamiltonian matrix
               PPS           OSS
   PPS    -4.89357215    0.00000000
   OSS     0.00000000   -4.68485776
--------------------------------------------------

 unrelaxed SA-REKS FONs for S0:  1.999990  0.000010

The final SA-REKS(2,2) energy is from the ensemble of the PPS and OSS states, the energy of which is being minimised. For the planar structure of an ethylene molecule, the energies of two states are -4.8936 and -4.6849 Hartree, respectively. The FONs for the PPS state are ~2.0 and ~0.0, while those for OSS state are ~1.0 and ~1.0. In addition, the energy of the triplet configuration which corresponds to contributions from the 5th and 6th microstates is now given in the standard output. Note that this energy is not the energy of the triplet state. The user can check for successful convergence by comparing the two Lagrangian Wab values, these will become almost the same if the energy is well converged.

After the energy calculation is finished, the gradient for the target state (TargetState = 1, which is the PPS state in this example) is calculated and the final gradient appears at the bottom of the standard output. The keywords in the Gradient block affect coupled-perturbed REKS (CP-REKS) equations which are used to calculate the gradient of target state. Here we choose a conjugate gradient solver and the keywords Preconditioner and SaveMemory are used to accelerate the computational speed of CP-REKS. These two keywords may be switched off depending on the user’s system.

Similar to the case above, the distorted structure can be calculated using SA-REKS and the results are given in the following:

--------------------------------------------------
 Final SA-REKS(2,2) energy:      -4.75910919

 State     Energy      FON(1)    FON(2)   Spin
  PPS   -4.77753765   1.000000  1.000000  0.00
  OSS   -4.74068073   1.000000  1.000000  0.00
 Trip   -4.77753837   1.000000  1.000000  1.00
--------------------------------------------------

 Lagrangian Wab:  -0.00004046  0.00004230

--------------------------------------------------
 SSR: 2SI-2SA-REKS(2,2) Hamiltonian matrix
               PPS           OSS
   PPS    -4.77753765   -0.00000000
   OSS    -0.00000000   -4.74068073
--------------------------------------------------

 unrelaxed SSR FONs for S0:  1.000000  1.000000

Now the energy of OSS state is -4.7407 Hartree. The energy of the triplet configuration is similar to the energy of PPS state. Since REKS can consider the static electronic correlations, it can produce the correct shape for the potential energy curve with respect to the torsional angle of C=C bond. If you want to calculate the energy of the DES state, then IncludeAllStates = Yes keyword in the Energy block will produce the energy of DES state as well as its FONs.

SI-SA-REKS

[Input: recipes/reks/si_sa_reks/]

State-interaction SA-REKS (SI-SA-REKS, briefly SSR) energies are obtained by solving a 2 \(\times\) 2 secular equation with the possible couplings between the SA-REKS states.

\[\begin{split}\left(\begin{array}{cc} E^{PPS} & \Delta^{SA} \\ \Delta^{SA} & E^{OSS} \end{array}\right) \left(\begin{array}{cc} a_{00} & a_{01} \\ a_{10} & a_{11} \end{array}\right) = \left(\begin{array}{cc} E^{SSR}_0 & 0 \\ 0 & E^{SSR}_1 \end{array}\right) \left(\begin{array}{cc} a_{00} & a_{01} \\ a_{10} & a_{11} \end{array}\right)\end{split}\]

By considering the state-interaction terms, the SSR states become more reliable when the excited states are included. The SSR states can be calculated with the StateInteractions = Yes in Energy block. For the planar structure, the resulting energies are given by:

----------------------------------------------------------------
 SSR: 2SI-2SA-REKS(2,2) states
                    E_n       C_{PPS}    C_{OSS}
 SSR state  1   -4.89357215  -1.000000   0.000000
 SSR state  2   -4.68485776  -0.000000  -1.000000
----------------------------------------------------------------

In this case, the ground state consists of the PPS state, while the lowest excited state is of OSS type. As the coupling term increases, the difference between the SA-REKS and SSR energies becomes larger.

Advanced calculations

Non-adiabatic couplings

[Input: recipes/reks/nonadiabatic_coupling/]

The non-adiabatic coupling vectors between the states can be calculated with the SSR method. By setting NonAdiabaticCoupling = Yes in REKS block, the user can easily calculate the non-adiabatic coupling vectors. When this option is turned on, SSR shows the gradients for all states and possible coupling vectors:

--------------------------------------------------
 Gradient Information
--------------------------------------------------
 1 st state (SSR)
     0.09734680    -0.00000000     0.00000000
    -0.09734682    -0.00000000    -0.00000000
     0.00639517     0.00417744    -0.00000000
     0.00639517    -0.00417744     0.00000000
    -0.00639516     0.00417743    -0.00000000
    -0.00639516    -0.00417743     0.00000000

 2 st state (SSR)
    -0.12803706    -0.00000000    -0.00000000
     0.12803704    -0.00000000     0.00000000
     0.00639486     0.00417837    -0.00000000
     0.00639486    -0.00417837     0.00000000
    -0.00639485     0.00417836    -0.00000000
    -0.00639485    -0.00417836     0.00000000
--------------------------------------------------
 Coupling Information
--------------------------------------------------
 between  1 and  2 states
--------------------------------------------------
 g vector - difference gradient
     0.11269193    -0.00000000     0.00000000
    -0.11269193     0.00000000    -0.00000000
     0.00000016    -0.00000046     0.00000000
     0.00000016     0.00000046     0.00000000
    -0.00000016    -0.00000046     0.00000000
    -0.00000016     0.00000046     0.00000000

 h vector - derivative coupling
    -0.00031874    -0.00000000     0.00000000
    -0.00031874     0.00000000    -0.00000000
     0.00015937     0.00007242     0.00000000
     0.00015937    -0.00007242     0.00000000
     0.00015937    -0.00007242     0.00000000
     0.00015937     0.00007242     0.00000000

 G vector - GDV
     0.11269193    -0.00000000     0.00000000
    -0.11269193     0.00000000    -0.00000000
     0.00000016    -0.00000046     0.00000000
     0.00000016     0.00000046     0.00000000
    -0.00000016    -0.00000046     0.00000000
    -0.00000016     0.00000046     0.00000000

 H vector - DCV - non-adiabatic coupling
    -0.00152718    -0.00000000     0.00000000
    -0.00152718     0.00000000    -0.00000000
     0.00076359     0.00034700     0.00000000
     0.00076359    -0.00034700     0.00000000
     0.00076359    -0.00034700     0.00000000
     0.00076359     0.00034700     0.00000000
--------------------------------------------------

The above gradients and coupling vectors are obtained with the planar structure of a ethylene molecule. The g vector is defined as gradient difference vector, thus it can be calculated from the difference of SA-REKS gradients. Similarly to this, G vector is calculated from the difference of SSR gradients. The h vector is defined as coupling gradient, so it can be simply calculated from the gradient of state-interaction terms. The H vector is defined as the derivative of the coupling vectors, thus its norm increases as the energy gap becomes smaller.

The g and h vectors can be regarded as the vectors defined through diabatic states, and the G and H vectors are defined through the adiabatic (SSR) states. In general, the non-adiabatic coupling vectors can be used for surface hopping molecular dynamics, while the g and h vectors can be used for minimum energy conical intersection (MECI) optimisation.

Relaxed Density

[Input: recipes/reks/relaxed_density/]

The relaxed density is calculated for the TargetState when the user sets RelaxedDensity to Yes. The calculation of a relaxed density requires the information about gradient, thus it can be calculated when the input enables gradient calculation. When this option is turned on, the relaxed FONs are given in the bottom of standard output and the relaxed_charge.dat file is generated. It includes the total charge as well as the Mulliken charges of each atom for target state. This example shows relaxed charges for the ground state:

total charge:     -0.00000000 (e)

relaxed S0 atomic charge (e)
   atom        charge
      1      -0.18168616
      2      -0.18168616
      3       0.09084308
      4       0.09084308
      5       0.09084308
      6       0.09084308

If one want to exploit SSR with a QM/MM approach, the RelaxedDensity option should be used to obtain the gradient of external point charges. Then, the gradient of external point charges for target state are given in detailed.out file. Therefore, this option can be used to run surface hopping dynamics with a QM/MM approach.

Spin tuning constants

The DFTB/SSR method well describes equilibrium geometries and vertical excitation energies as compared with SSR/wPBEh results. (See the paper JCTC, 2019, 15, 3021-3032.) However, the behaviours at MECI points sometimes does not match those obtained with SSR/wPBEh. For example, the n/\(\pi^*\) type MECI geometry of ethylene or methyliminium molecule cannot be located with DFTB/SSR, with an incorrect description of the relative stability of the PPS and OSS states being mostly responsible. Their relative stability depends on the splitting between the open-shell singlet microstates and the triplet microstates in the PPS and OSS energies.

In principle, DFTB/SSR employs spin-polarised DFTB formalism, in which the spin-polarisation contribution is obtained from the second-order expansion of the magnetisation density with respect to zero magnetisation electronic structure. At the n/\(\pi^*\) type MECI geometries, both frontier orbitals are located on the same atom. In such a case, the second-order expansion of magnetisation may not be suitable for the triplet microstates, as the spin density becomes too large. As a simple solution, the stability between the PPS and OSS states can be adjusted by scaling the atomic spin constants. For most molecules the FONs for the PPS state become \(n_a\) ~ 2.0 and \(n_b\) ~ 0.0, hence the energy of the PPS state is determined by the 1st microstate alone and it is only the energy of the OSS state that depends on the atomic spin constants.

If the user runs the test calculation included with the main DFTB+ repository in the test/prog/dftb+/reks/PSB3_2SSR_rangesep_tuning directory, the following results are given in the standard output:

----------------------------------------------------------------
 SSR: 2SI-2SA-REKS(2,2) states
                    E_n       C_{PPS}    C_{OSS}
 SSR state  1  -16.39950035  -0.955182  -0.296020
 SSR state  2  -16.38979921   0.296020  -0.955182
----------------------------------------------------------------

H vector - DCV - non-adiabatic coupling
   -0.11443527     0.16066530     0.18737790
    0.15477051    -0.03677914     0.02047645
   -0.81366262     0.56406749    -1.41489688
    0.69877935    -2.68846832     1.70292281
   -0.34893713     0.81739627     0.50771020
   -0.08309257     0.13356401     0.02704612
    0.99884931     0.52013559    -1.04847579
   -0.46122482     0.81158082    -0.36230963
   -0.03763611     0.00213152     0.02451782
    0.65298995     0.42710364    -0.20222417
   -0.52172598    -0.75236687     1.00033386
   -0.24407958    -0.11018546    -0.50337767
    0.04971872    -0.03074284     0.01547016
    0.06968625     0.18189801     0.04542884

It shows the energies at MECI point of a PSB3 molecule, thus the non-adiabatic coupling vectors show large elements near to the centre C=C bond. The atomic spin constants can be modified by using SpinTuning keyword in REKS block as follows:

Reks = SSR22 {
  Energy = {
    Functional = { "PPS" "OSS" }
    StateInteractions = Yes
  }
  TargetState = 2
  FonMaxIter = 30
  shift = 0.3
  SpinTuning = { 3.2 3.2 3.2 }
  TransitionDipole = Yes
  Gradient = ConjugateGradient {
    CGmaxIter = 100
    Tolerance = 1.0E-8
    Preconditioner = Yes
    SaveMemory = Yes
  }
  RelaxedDensity = Yes
  NonAdiabaticCoupling = Yes
  VerbosityLevel = 1
}

Microstate calculation

[Input: recipes/reks/microstate/]

Obviously, the SSR method treats only singlet states like PPS or OSS. If one want to compare the energy of singlet and triplet states, SSR provides the energy of a triplet configuration as an alternative which corresponds to the 5th or 6th configuration in the (2,2) active space. Thus, the user can easily compare the energy of singlet and triplet microstates.:

--------------------------------------------------
 Final SA-REKS(2,2) energy:      -4.75910919

 State     Energy      FON(1)    FON(2)   Spin
  PPS   -4.77753765   1.000000  1.000000  0.00
  OSS   -4.74068073   1.000000  1.000000  0.00
 Trip   -4.77753837   1.000000  1.000000  1.00
--------------------------------------------------

In this example, for a distorted structure of ethylene, the energy of triplet microstate is close to that of the PPS state, since the frontier orbitals are the localised \(\pi\) orbitals in this system. If one want to know the gradient of the triplet microstate as well as its relaxed density, these can be obtained by using TargetMicrostate keyword in REKS block. If the value for this keyword is to 5, then the properties will be calculated according to the index of the microstate. In a (2,2) active space, the 5th microstate indicates a triplet configuration, thus the output shows the quantities for this microstate. The following results are obtained from the distorted structure of ethylene molecule:

--------------------------------------------------
 Gradient Information
--------------------------------------------------
 5 microstate
    -0.00926010    -0.00000000     0.00000000
     0.00926011     0.00000000    -0.00000000
     0.00096561     0.00066166     0.00000008
     0.00096561    -0.00066166    -0.00000008
    -0.00096562    -0.00000009     0.00066176
    -0.00096562     0.00000009    -0.00066176
--------------------------------------------------

The gradient is now calculated for the 5th microstate. In addition, the energy of spin contribution in the detailed.out file is -0.023, Hartree which corresponds to the spin constant \(W_{pp}\) for a carbon atom. In this case the frontier orbitals consisted of only p orbitals of a carbon atom, thus the energy of the spin contribution mostly consists of interactions between these orbitals. With this option, one can run the molecular dynamics simulation for the triplet microstate.

This example shows an input file for calculation of a triplet microstate:

Reks = SSR22 {
  Energy = {
    Functional = { "PPS" "OSS" }
  }
  TargetState = 1
  TargetMicrostate = 5
  FonMaxIter = 100
  shift = 20.0
  Gradient = ConjugateGradient {
    CGmaxIter = 100
    Tolerance = 1.0E-8
    Preconditioner = Yes
    SaveMemory = Yes
  }
  VerbosityLevel = 1
}

Note that TargetMicrostate keyword can be used only with the SA-REKS input settings discussed above.

Error handling with REKS

Oscillation of energy during SCC

[Input: recipes/reks/oscillation_energy/]

The oscillation of SA-REKS energy in the SCC process may appear as follows:

646       -8.9586042135       0.0000229869     0.621642    0.00993712
647       -8.9586272004      -0.0000229869     0.601626    0.00993712
648       -8.9586042135       0.0000229869     0.621642    0.00993712
649       -8.9586272004      -0.0000229869     0.601626    0.00993712
650       -8.9586042135       0.0000229869     0.621642    0.00993712
651       -8.9586272004      -0.0000229869     0.601626    0.00993712
652       -8.9586042135       0.0000229869     0.621642    0.00993712
653       -8.9586272004      -0.0000229869     0.601626    0.00993712
654       -8.9586042135       0.0000229869     0.621642    0.00993712

These oscillations mostly arise from the degeneracy of the frontier orbitals. As the difference between frontier orbitals gets smaller, it becomes harder to determine the order of the frontier orbitals during the SCC process. As a result, oscillating phenomena appear in these cases. To solve this, the user can increase the shift value by setting Shift in REKS block, which increases the separation in energy between the lower and upper orbital, stabilising the convergence. Smoother variational convergence of the SA-REKS energy can occur in the SCC process:

  111       -8.9586871392      -0.0000000004     0.610476    0.00000117
  112       -8.9586871395      -0.0000000003     0.610474    0.00000110
  113       -8.9586871398      -0.0000000003     0.610473    0.00000103
  114       -8.9586871401      -0.0000000003     0.610471    0.00000097

--------------------------------------------------
 Final SA-REKS(2,2) energy:      -8.95868714

 State     Energy      FON(1)    FON(2)   Spin
  PPS   -8.97337341   1.220943  0.779057  0.00
  OSS   -8.94400087   1.000000  1.000000  0.00
 Trip   -8.97307659   1.000000  1.000000  1.00
--------------------------------------------------

Thus, one can check the FONs for the PPS state become \(n_a\) ~1.2 and \(n_b\) ~0.8 in this example. As the orbital energies of the active orbitals are close each other, the user should increase the shift value as well as the number of MaxSCCIterations.

Oscillation of SA-REKS energy can also occur with the following irregular pattern:

  84      -11.5656112187       0.0000347081     0.535896    0.00170908
  85      -11.5616527044       0.0039585143     0.533173    0.01062304
fa =  0.471347, MO swap between a(  14) and b(  15) occurs
  86      -11.5390645462       0.0225881582     0.442694    0.05641408
fa =  0.460378, MO swap between a(  14) and b(  15) occurs
  87      -11.5618504576      -0.0227859114     0.420756    0.00750593
fa =  0.481839, MO swap between a(  14) and b(  15) occurs
  88      -11.5564810194       0.0053694382     0.463678    0.02941886
  89      -11.4916836033       0.0647974162     0.993664    0.50049487
  90      -11.5138647944      -0.0221811911     0.999927    0.21955679

Selection energy functional to be minimised

[Input: recipes/reks/benzene/]

A benzene molecule has two degenerate \(\pi\) orbitals and two degenerate \(\pi^*\) orbitals. Thus, if one want to calculate this case, then the active space should include two \(\pi\) and two \(\pi^*\) orbitals. Hence, a (2,2) active space may be insufficient to investigate this system, but the ground state can be optimised with DFTB/SSR(2,2). Since there is an ambiguity in deciding the active space, the OSS state cannot be correctly generated. Thus, only single-state REKS is recommended for C6H6, and the resulting energy of the PPS state is given as:

--------------------------------------------------
  Final REKS(2,2) energy:     -12.56867223

 State     Energy      FON(1)    FON(2)   Spin
  PPS  -12.56867223   2.000000  0.000000  0.00
--------------------------------------------------

Note that the orbital energy of the upper frontier orbital (in this case, the 16th orbital) in the band.out file is not correctly provided since there is an ambiguous part in the single-state REKS method when the Fock matrix is constructed as follows:

14    -6.700  2.00000
15    -6.700  2.00000
16    -7.583  0.00000
17    -1.399  0.00000

This problem only occurs in single-state REKS, thus the band energies are correctly calculated for the SA-REKS or SSR methods.

According to the system, there may exist instability for the lowest excited singlet state, which corresponds to the OSS state. In this case, including only the PPS state to minimised the energy functional is recommended, then the geometry will be optimised successfully.

Failure of CP-REKS equations

[Input: recipes/reks/acetylene/]

In general, a preconditioned conjugate-gradient algorithm is used to solve CP-REKS equations. It calculates quickly and accurately for most molecules, but it may fail to calculate a preconditioner due to the system symmetry. The stable geometry of an acetylene is planar and shows a high symmetry, thus a singularity appears as in this case as follows:

ERROR!
-> A singularity exists in preconditioner for PCG, set Preconditioner = No

In this case, using conjugate-gradient without a preconditioner helps to solve CP-REKS equations. The CP-REKS equations are then successfully solved as follows:

----------------------------------------------------------------------------------
 Solving CP-REKS equation for  1 state vector...
----------------------------------------------------------------------------------
  CG solver: Constructing Y initial guess
  CG solver: Iteration    1    eps =    0.000000000000
  Convergence reached in CG solver after    1 iterations
  CG solver: Calculating converged R, Z, Q2 matrix
----------------------------------------------------------------------------------

Therefore, the user can check the geometry and symmetry when these errors occur.

Electron transport

This section contains recipes for various electron-transport related calculations where a molecular or periodic structure is attached to semi-infinite contacts. This then be used to calculate the transport of electrons though the structure (and also to examine a modified region in an otherwise perfect structure).

Specifying the geometry

Partitioning the system

In the simplest case, transport calculations can be performed on an atomistic structure comprising of two semi-infinite contacts. These contacts are usually one-dimensional periodic systems (wire-like) and connect to a device region.

In order to carry out a transport calculation with DFTB+, the geometry of the system must be carefully partitioned by the user and the structure must contain:

  1. The extended molecule (the central region),
  2. Two Principal Layers (see below) for the first contact,
  3. Two Principal Layers of the second contact.

(additional contacts can also be included, but we will first start with the most common situation of a 2 contact geometry).

Partitioning of a device

Partitioning of a simple system into an extended molecule and perfect contacts.

The extended molecule contains the atomistic device itself plus those parts of the attached contacts, which are directly influenced by the presence of the device (we can call them surfaces). Each contact, on the other hand, contains those parts of the actual contacts, which are far enough from the device not to be affected by it (the examples below will better clarify this point).

Partitioning the system and contacts

The most important concept to bear in mind is that of principal layers (PLs). These are defined as contiguous groups of atoms that have interaction only with atoms of the immediately adjacent PLs. In practice these layers must contain a sufficient number of atoms in order to ensure that the hamiltonian and overlap interactions vanish before reaching the second nearest neighbour PLs (or can be considered negligible). The subdivision of the system into PLs is essential for the definition of the two contacts and becomes useful in the computation of all Green’s functions that exploit a recursive algorithm (See [PPD2008]).

The subdivision of the structure into layers

Subdivision of a general structure into principal layers (PL), two for each contact and an arbitrary number within the extended molecule region.

As shown in Figure 22, the extended device molecule can contain an arbitrary number of PLs (>=1), but the layers themselves must follow a sequential ordering. The ordering of the PLs follow directly from the spatial ordering of the atoms in the DFTB+ structure.

Typically it is convenient to create the structure and then sort the atom along the transport direction before then partitioning up the system. In other cases, for example when nanowires are constructed, it is more convenient to repeat a PL unit for the desired length of the extended molecule.

The (perfect) contacts are defined by two principal layers. Unlike the layers of the extended molecule region, the PLs defining the contacts must follow additional rules:

  1. The two principal layers of a given contact must be identical.
  2. They must be rigidly shifted images of each other.
  3. The first PL must be the closest to the device region.
Ordering the atoms in the system

In order to ensure that the above prescriptions are met the numbering of the atoms must follow a precise ordering.

The atoms of the central region must be specified first in the structure, i.e. before the atoms of the contacts (see Partitioning the system and contacts for details). The atoms of each of the contacts then follow those of the central region in turn. The atoms of each contact must be grouped together sequentially (You specify either all atoms of the left contact first, and then those of the right one, or vice versa). The numbering field for the atoms in the gen format input is ignored (see Geometry for details), it is the order of the atoms in the structure that matters. We often find it useful for readability to number atoms in the device (or its principal layers) sequentially, then restart the count for each contact.

The numbering of the atoms within the first principal layer of a contact is arbitrary, but the same ordering of atoms must be applied to the second PL of that contact. Thus, the order that the atoms are listed in each of the two contact PLs must be the same and their location in the list must differ by the same amount for all atoms in the two PLS. The coordinates of the equivalent atoms in the two layers must also differ by the same vector (the contact vector) which translates all atoms of the first PL onto their equivalents in the second PL. These prescriptions are checked by DFTB+ and an error message is issued if the two PLs do not conform to these requirements. It is important to remember to number the atoms of the first PL to be the layer closer to the device, i.e. they must contain the atoms with the lowest indices in the contact.

atomic_numbering

Example for numbering of the atoms. Those in the device have the lowest indices, followed by the atoms of each contact, respectively. The two principal layers making up each contacts are shifted copies of each other (both in space and order of numbering).

Supercell structures

DFTB+ can compute transport for structures that have periodicity in one or both directions that are transverse with respect to the transport direction. In this case the structure must be defined as a supercell and the rules listed above apply to the resulting cell. The real-space Poisson solver of DFTB+ limits the supercell lattices to being of orthorombic type (orthogonal vectors parallel to the cartesian axes, i.e. all angles between supercell vectors must be 90 degrees). In fact the supercell is always defined as being 3-dimensional, and the user should ensure that the dummy lattice vector along the transport direction is long enough to avoid superpositions between atoms in images along that direction.

The current version of DFTB+ only supports one supercell definition for the entire system, including central region and contacts. It may be redundant to observe that in this case the two contacts must be of the same periodicity in the directions transverse to the transport direction.

Transport Block

The geometry must be defined in the transport block, as specified in the following example:

Transport {
  Device {
    # Device is the 1st to 24th atom in the geometry list
    AtomRange = 1 24
  }
  Contact {
    Id = "source"
    # This contact starts at atom 25
    AtomRange = 25 44
  }
  Contact {
    Id = "drain"
    # This contact starts at atom 45
    AtomRange = 45 58
  }
}

Setup Geometry utility

[Input: recipes/transport/setup-geometry/]

The DFTB+ distribution includes a tool to help prepare structures for transport calculations in accordance with the requirements described in Specifying the geometry. Here we give an example that works most easily with the help of the tool jmol viewer. The example discussed here corresponds to the molecular junction that will be presented in the Example: Molecular Junction section.

Input file

The setupgeom tool works by reading in a minimal input file in hsd format called setup_in.hsd which looks like the following:

Geometry = GenFormat{
  <<< "initial.gen" # name of input geometry
}
Transport{
  Contact{
    Id = source
    Atoms = {1:24}
    ContactVector [Angstrom] = 4.94 0.0 0.0
    NumPLsDefined = 1
  }
  Contact{
    Id = drain
    NumPLsDefined = 1
    Atoms = {120:143}
    ContactVector [Angstrom] = 4.94 0.0 0.0
  }
  Task=SetupGeometry{}
}

The most significant keywords are

  • Id the label for the contact.
  • Atoms specifying all atoms belonging to each contact.
  • ContactVector it is also necessary to specify the contact supercell vectors Note that contacts must extend along only either the x, y or z directions.
  • SetupGeometry This is the Task that must be invoked in order to perform the geometry preparation for transport.
  • SpecifiedPLs. It is used to specify how many contact PLs are provided. Possible values are 1 or 2. See Partitioning the system and contacts for details.

Selecting atoms using jmol

Any suitable external tool can be used for identifying the atoms in the contacts (or a manual selection can also be made). Here we will use jmol, but unfortunately this program does not recognise gen files. So in this case you will first have to convert the starting geometry into a format readable by jmol. Usually xyz is best to this purpose. Use gen2xyz (or any other conversion tool such as babel or ASE).

Open the structure in jmol and select the atoms of the first contact. If the contact already contains two PLs, select both. At this stage it is likely that the two PLs are not following the correct numbering, namely they are not two shifted exact copies of each other. No worries! SetupGeometry will reorder the atom indices of the two PLs in the correct way. In some cases you might have only one PL per contact. The tool can then be told to duplicate this single PL, as required by the input geometry. If this is needed add the keyword NumPLsDefined = 1 to the relevant Contact block(s).

Screenshot showing the selected atoms of contact 1

Jmol screenshot showing the selected contact atoms

In Figure 24 it is possible to see the geometry to be processed for reordering with selected atoms belonging to the first contact.

Different strategies can be used to select the contact atoms in jmol. The easiest is probably using the select tool and use the mouse (see Figure 25). Orient the molecule and use the select tool by holding SHIFT + LEFT Mouse Button, then drag the mouse to include all contact atoms (see the mouse section of the jmol wiki).

Screenshot of Jmol showing the selection tool

Jmol screenshot showing the selection tool

NOTE: Initially, when you click on the selection tool, all atoms will be selected and will appear highlighted. You can either

  • Unselect all atoms by drawing a box around the whole structure with SHIFT + LEFT MOUSE

  • Choose the menu Display -> Select -> none to unselect all atoms.

  • Alternatively, open the Jmol Script Console and type:

    $ select none
    

Now you can select the contact atoms and then list the selected atoms by typing into the jmol console:

$ print {selected}

In this example you will then see:

({45:60 69:84 93:108})

The selected atoms are shown in a compact syntax that can be directly copy-pasted into setup_in.hsd. NOTE that this jmol command shows atom numbers starting from 0 and not from 1. In this case use the following syntax should be used in the setup_in.hsd input file:

Atoms [zeroBased] = {45:60 69:84 93:108}

where the modifier zeroBased tells the code that atom indices start counting from 0. Then repeat a similar process for the other contact.

The ContactVector specification is needed so the code can understand the direction of the contact and the supercell periodicity. Use the measurements tool of jmol in order to get the vector length (See Figure 24).

The user should provide the Slater-Koster files so the code can elaborate the correct cutoff distances. These are specified in the same way as for the DFTB+ code:

Task = SetupGeometry{
  SlaterKosterFiles = Type2FileNames{
     prefix =  "PATH/"
     separator  = "-"
     suffix  = ".skf"
   }
}

The following behaviour is relevant.:

* ``SpecifiedPLs = 2``: In this case `setupgeom` reorders the second PL and
  checks that the distance between second-neighbour PLs is larger than the
  cutoff. An error is shown if this is not the case.

* ``SpecifiedPLs = 1``: In this case `setupgeom` builds as many additional PLs
  as needed to fulfil the contact requirements. This can produces thicker
  contacts with two new revised PLs.

In both cases the device region is further layered into PLs for the efficient iterative Green’s function algorithm. In most cases the SK tables have a rather large cutoff, extending as long as all Hamiltonian matrix elements are below 1e-5 a.u. (about 1 meV). In order to make transport calculations a little faster it is possible to slightly shorten the SK cutoffs. A small decrease easily results in PLs with half of the original number of atoms and hence faster calculations, with very small effect on the final results (e.g., transmission, ldos, currents). The SK cutoff can be set with the block TruncateSKRange (also see the DFTB+ manual):

Transport{
  Task = SetupGeometry{
      TruncateSKRange = {
         SKMaxDistance [AA] = 5.0
         HardCutOff = Yes
      }
  }
}

Clearly in doing this, accuracy is traded for speed. In the case of C-C interactions, the parameters have a cutoff distance of about 5.17 Angstrom that is quite comparable with a reduced cutoff of 5.0 Angstrom. In any case the user should check and validate the results of selecting this option.

Once the input is ready, convert the structure to your preferred input file (initial.gen in this example) and run setupgeom. As output you will find the structure processed.gen prepared for transport calculations and a file transport.hsd containing the Transport block needed for the following contact calculations:

Transport{
  Device{
    FirstLayerAtoms = { 1 25 40 50 60 76 }
    AtomRange = 1 95
  }
  Contact{
    AtomRange = 96 143
  }
  Contact{
    AtomRange = 144 191
  }
}

The file is formatted such that it can be appended or included into the end of the dftb_in.hsd input.

For consistency, the user should specify exactly the same SKMaxDistance that was used in setting up the geometry inside the input file of DFTB+ (if it is modified from the default set by the Slater-Koster files).

Electron transport calculations in armchair nanoribbons

In this chapter we will learn how to set up self-consistent and non-self-consistent simulations with open boundary conditions by using DFTB+. We will then calculate the density of states and the transmission coefficients and analyse the results in comparison with the previous periodic calculations.

Non-SCC pristine armchair nanoribbon

[Input: recipes/transport/carbon2d-trans/agr-nonscc/ideal/]

Preparing the structure

When we run a transport calculation with open boundary conditions, the geometry specified in the input needs to obey some rules (see Specifying the geometry). The system must consist of an extended device (or molecule) region, and two or more semi-infinite bulk contacts. The bulk contacts are described by providing two principal layers for each contact.

A Principal Layer (PL) is defined as a contiguous group of atoms that have finite interaction only with atoms belonging to adjacent PLs. In a sense, a PL is a generalisation of the idea of nearest neighbour atoms to the idea of nearest neighbour blocks. The PL partitioning in the electrodes is used by the code to retrieve a description of the bulk system. PLs may be defined, as we will see, in the extended device region to take advantage of the iterative Green’s function solver algorithm.

Additional information about the definition of PLs, contacts and extended device region can be found in the manual.

In the case of an ideal one-dimensional system, all the PLs are identical. The system we start with is an infinite armchair graphene nanoribbon (AGR), therefore the partitioning into device and contact regions is somewhat arbitrary. We will therefore start from a structure file containing a single PL (2cell_7.gen), which has been previously relaxed. The PL can be converted to XYZ format by using the gen2xyz script and visualised with Jmol:

gen2xyz 2cell_7.gen
jmol 2cell_7.xyz

The structure is shown in Figure 26

Armchair nanoribbon PL

Armchair nanoribbon principal layer (PL)

As you may notice, we did not take a single unit cell length as a PL, but rather two unit cells of the graphene conventional unit cell. This choice is dictated by the definition of the PL itself, as we want to avoid non-zero interactions between second-neighbour PLs. This is better explained by referring to Figure Layer definition. The red carbon atoms represent the closest atoms which would belong to non-nearest neighbour PLs, and these have a separation of 0.568 nm, as shown in Figure Layer definition. The carbon-carbon interaction is non-zero up to a distance of 6 a.u., therefore the interaction between the two red atoms would be small but non-zero. Hence this is too small a separation for a one unit cell long section of nanoribbon to be used as the PL.

Layer definition

Layer definition

In this case the PL must contain two unit cells, in this case, as shown in figure Layer definition. It follows that the correct definition of a PL depends both on the geometry of the system and the interaction cut-off distance as defined in the SK files (In the first line of the SK-files this is given as the grid spacing in atomic units and the number of grid points in the file). The cutoff distance can be shortened slightly using the option TruncateSKRange in the Hamiltonian section, however users should be aware that this impacts the electronic properties of the system, hence should be used by experts only.

After having defined a proper PL, we then build a structure consisting of a device region with 2 PLs and contacts at each end of this region, each consisting of 2 PLs.

Note: For the pristine system, intensive properties in equilibrium calculations should not depend on the length of the device region, as the represented system is an infinite ideal nanoribbon with discrete translational symmetry along the ribbon.

The input atomic structure must be defined according to a specific ordering: the device atoms come first, then each contact is specified, starting with the PL closer to the device region. For an ideal system defined by repetition of identical PLs, the tool buildwire (distributed with the DFTB+ code) can be used to build geometries with the right ordering.

When you type:

buildwire 2cell_7.gen 3 2

the code use the geometry contained in the input supercell (2cell_7.gen), assuming direction 3 (=z) is the transport direction and that the number of principal layers in the device region will set this to be 2.

The code buildwire then will produce the correct transport block for dftb_in.hsd:

Transport{
  Device{
    AtomRange = 1 136
    FirstLayerAtoms = 1 69
  }
  Contact{
    Id = "source"
    AtomRange = 137 272
  }
  Contact{
    Id = "drain"
    AtomRange = 273 408
  }
  Task= contactHamiltonian{
    contactId = "source"
  }
}

A file Ordered_2cell_7.gen will have been created, which we will rename device_7.gen using the following:

mv Ordered_2cell_7.gen device_7.gen

We can better understand the ordering of the atomic indexes if we convert this structure to XYZ, open it with jmol and then change the colours of specific ranges of atoms by using the following syntax in the jmol console (for example, we select here the first contact and split it into two sub-ranges containing its first and second PLs):

> select atomno>136 && atomno<205
> color yellow
> select atomno>204 && atomno<273
> color red

In Figure 28 a Jmol export of the structure is shown.

PLs in contact 1

The PLs of contact 1

The yellow and red atoms represent the first and second PLs of the first contact. When you build a structure yourself, it is always a good idea to use a visualiser and verify that the atomic indices are consistent with the transport setup definitions.

The last step is to make sure the structure is defined as a cluster. From the point of view of an open boundary condition calculation, supercells (S) and clusters (C) have slightly different meanings compare with canonical DFTB calculations. By supercell we mean any structure which is periodic in any direction transverse to the transport direction, while for cluster we mean any structure not periodic in any direction transverse to transport. It follows that purely 1D systems, like nanowires and nanoribbons, should be regarded as clusters (C). Therefore we edit the structure file device_7.gen, changing in the first line the S (supercell) to be C (cluster) and remove the last four lines, which would normally only be defined for periodic systems. The newest versions of buildwire should automatically do this. The corrected definition for the 1D ribbon with open boundary conditions is then:

408  C
C    H
  1    1     37.831463060000    -20.000000000000      0.710000000000
  2    1     39.061219140000    -20.000000000000      1.420000000000
  3    1     39.061219140000    -20.000000000000      2.840000000000
  4    1     37.831463060000    -20.000000000000      3.550000000000
  5    1     35.371950920000    -20.000000000000      0.710000000000
  6    1     36.601706990000    -20.000000000000      1.420000000000
  7    1     36.601706990000    -20.000000000000      2.840000000000
  8    1     35.371950920000    -20.000000000000      3.550000000000
  ........
  65    2     20.880312110000    -20.000000000000    -11.870830122700
  66    2     20.880312110000    -20.000000000000     -9.429169877000
  67    2     40.025607920000    -20.000000000000    -11.870893735700
  68    2     40.025607920000    -20.000000000000     -9.429106264000

Now the file device_7.gen contains the correct structure, defined as a cluster and with the proper atom ordering. Next, we set up the input file for a tunnelling calculation.

Transmission and density of states

In the following we will first set up the simplest open boundary condition calculation: transmission coefficients according to the Landauer-Caroli formula, assuming a non-SCC DFTB hamiltonian. We will discuss and comment the different sections contained in the file dftb_in.hsd.

First, we have the specification of the geometry:

Geometry = GenFormat {
<<< 'device_7.gen'
}

This follows the same rule as in a regular DFTB+ calculation, except for the fact that the structure should follow the specific partitioning structure explained in the previous section.

Whenever an open boundary system is defined, we have to specify a block named Transport which contains information on the system partitioning and additional information about the contacts to the device:

Transport {
  Device {
    AtomRange = 1 136
    FirstLayerAtoms =  1 69
  }
  Contact {
    Id = "source"
    AtomRange = 137 272
    FermiLevel [eV] = -4.7103
    potential [eV] = 0.0
  }
  Contact {
    Id = "drain"
    AtomRange = 273 408
    FermiLevel [eV] = -4.7103
    potential [eV] = 0.0
  }
}

Here we have used the indexes printed by buildwire. Device contains two fields: AtomRange specifies which atoms belong to the extended device region (1 to 136) and FirstLayerAtoms specify the starting index of the PLs in the device region. This field is optional, but if not specified the iterative algorithm will not be applied and the calculation will be slower, even though the result will be still correct. Then we have the definitions of the contacts. In this example we define a two terminal system, but in general N contacts are allowed. A contact is defined by an Id (mandatory), the range of atoms belonging to the contact specified in AtomRange (mandatory) and a FermiLevel (mandatory). The potential is set by default to 0.0, therefore need not be specified in this example.

Note that in non-SCC calculations that do not compute the Density Matrix of the system, the Fermi level and the contact potential are not necessary to calculate a transmission curve, but they are needed to calculate the current via the Landauer formula, as they would determine the occupation distribution in the contacts.

Then we have the Hamiltonian block, describing how the initial Hamiltonian and the SCC component, if any, will be calculated:

Hamiltonian = DFTB {
  SCC = No
  MaxAngularMomentum {
    C = "p"
    H = "s"
  }
  SlaterKosterFiles = Type2FileNames {
    Prefix = "../../slako/"
    Separator = "-"
    Suffix = ".skf"
  }
  Solver = TransportOnly{}
}

In this example we will calculate the transmission according to the Caroli (referred by some authors as the Fisher Lee) formula in a non-SCC approximation, i.e. the Hamiltonian is directly assembled from the Slater-Koster files and used “as is” to build the contact self energies and the extended device Green function. The use of an eigensolver is not meaningful in an open boundary setup, as the system is instead solved by the Green function technique. Therefore we just use a keyword TransportOnly to indicate that we do not want to solve an eigenvalue problem. The other fields are filled up in the same way as for a regular DFTB calculation.

Usually in DFTB+ an eigensolver is regarded as a calculator which can provide the charge density in the SCC cycle, therefore we will instead define a Green’s function based solver later, but only for SCC calculations.

Note that as C-H bonds are present in the system, charge transfer should occur, hence the result will not be accurate at the non-SCC level. It is not a-priori trivial to predict whether this affects qualitatively or quantitatively the transmission. We will therefore later compare these results with an SCC calculation - at the moment we will stay at the level of a non-SCC calculation, because it is faster to execute and also allows us to use the simplest input file possible.

Finally, the implementation of the Landauer-Caroli formula is regarded as a post-processing operation and specified by the block TunnelingAndDos inside Analysis:

Analysis {
  TunnelingAndDos {
    Verbosity = 101
    EnergyRange [eV] = -6.5  -3.0
    EnergyStep [eV] = 0.01
    Region {
      Atoms = 1:136
    }
  }
}

TunnelingAndDos allows for the calculation of Transmission coefficient, Local Density of States (LDOS) and current. A transmission is always calculated using the energy interval and energy step specified here. The LDOS is only calculated when sub-blocks Region are defined. Region can be used to select some specific subsets of atoms or orbitals, according to the syntax explained in the manual. In this example, we are specifying the whole extended device region (atoms 1 to 136). Note that the energy range of interest is not known a-priori. Either you have a reference band structure calculation so therefore know where the first sub-bands are (the correct way to do this), or you can run a quick calculation with a large energy step and on the basis of the transmission curve then refine the range of interest.

We can then start the calculation:

dftb+ dftb_in.hsd | tee output.log
Parallelism in transport calculations

Please have have a look at the section Parallel usage of DFTB+ for a general discussion. In transport calculations we can take advantage of parallelisation over the energy points by running the (mpi enabled) code with mpirun:

mpirun -n 4 dftb+ dftb_in.hsd | tee output.log

where 4 should be substituted by the number of available nodes. Note that non-equilibrium Green’s function (NEGF) calculations are parallelised over energy points, therefore a number of nodes larger than the energy grid will not improve performances and secondly that the memory consumption is proportional to the number of nodes used - this may be critical in shared memory systems with a small amount of memory per node. SMP parallelism is also used via OpenMP multi-threading, this is exploited at low level linear algebra numerics such as Matrix-Matrix multiplications and Matrix inversions, especially when linking to libraries such as the Intel MKL. Multi-threading is not enabled by default in DFTB+ since this can easily collide with the parallel SCALAPACK diagonaliser. In order to enable OMP calculations you should explicitly look for Parallel block:

Parallel{
  UseOmpThreads = Yes
}

Some experimentation can be done in order to find the optimal combination of MPI and OpenMP. Clearly the two schemes should not overlap on the same CPU(s). For instance in the early days Xeon Quad Cores CPUs the best performances could be obtained by running OpenMP with maximum of 4 threads (OMP_NUM_THREADS=4) and MPI across as many separate nodes or sockets as available. As the number of cores on each socket has increased to 8, 10 or more, the most efficient balance has to be determined by testing. Typically openMP does not scale quite linearly but tend to saturate at about 4 threads for typical transport calculations. Therefore it seems optimal to divide the total number of available cores modulo 4 threads to get the number of MPI processes. So, for instance with 64 cores spread on 4 units with 2 sockets of 8 cores each, it should be fine to set OMP_NUM_THREADS=4 and 16 MPI processes.

Plotting Transmission and DOS

When the calculation has finished, the transmission and density of states are saved to separate transmission.dat and localDOS.dat files. These additional files both contain the energy points in the first column and the desired quantities as additional columns.

We can plot the transmission by using the plotxy script:

plotxy --xlabel 'Energy [eV]' --ylabel 'Transmission' -L transmission.dat

The plot is shown in Figure 29:

Non-SCC transmission in pristine AGR

Non-SCC transmission through a pristine AGR

The ribbon is semiconducting, therefore we can see a zero transmission at energies corresponding to the band gap. As the system is ideal, outside of the band gap we can observe the characteristic conductance steps where the value of the transmission is 1.0 for every band which crosses a given energy. This is a normal signature of ideal 1D systems with translational invariance.

Similarly, we can visualise the density of states by typing (the x and y axis limits are chosen to focus on the first few sub-bands):

plotxy --xlabel 'Energy [eV]' --ylabel 'DOS [arbitrary units]' -L \
--xlimits -6.5 -3 --ylimit 0 1400 localDOS.dat

The result is shown in Figure 30:

Non-SCC density of states in pristine AGR

Non-SCC density of states for a pristine AGR

You can plot the transmission or the density of states on a semi-logarithmic scale:

plotxy --xlabel 'Energy [eV]' --ylabel 'Transmission' -L \
--xlimits -6.5 -3 --logscale y localDOS.dat

If you do so, you will obtain the plot shown in Figure Non-SCC density of states on logarithmic scale.

Non-SCC density of states in logarithmic scale

Non-SCC density of states on logarithmic scale

The density of states inside the band-gap is not zero, but decreases by several orders of magnitude. This is a natural consequence of the quasi-particle nature of the Green’s function formalism: every state in the system has a finite broadening in energy.

Non-SCC armchair nanoribbon with vacancy (A)

[Input: recipes/transport/carbon2d-trans/agr-nonscc/vacancy1/]

Transmission and Density of States

Now that we have a calculation of the reference pristine system, we will introduce a scattering centre by producing a vacancy in the system. In order to do so, we directly modify the structure file device_7.gen and the input file dftb_in.hsd. We remove atom number 48 from the structure file. Note that DFTB+ ignores the indexes in the first column of the .gen file, therefore we do not need to adjust them. We have, however, to remember to change the total number of atoms in the first line from 408 to 407:

407  C
C    H
1    1     37.831463060000    -20.000000000000      0.710000000000
2    1     39.061219140000    -20.000000000000      1.420000000000
3    1     39.061219140000    -20.000000000000      2.840000000000
.....
46    1     32.912438770000    -20.000000000000      7.810000000000
47    1     30.452926620000    -20.000000000000      4.970000000000
49    1     31.682682700000    -20.000000000000      7.100000000000
50    1     30.452926620000    -20.000000000000      7.810000000000
...

The resulting structure should look like this:

Geometry with vacancy on sublattice A

Geometry with vacancy on sublattice A

We then also adjust the dftb_in.hsd file accordingly. As we have removed an atom, all the indexes in the transport block need to be adjusted properly. Note that we removed an atom in the first PL of the extended device, therefore we also need to adjust the values of FirstLayerAtoms. The Transport block now reads:

Transport {
    Device {
      AtomRange = 1 135
      FirstLayerAtoms =  1 68
    }
    Contact {
      Id = "source"
      AtomRange = 136 271
      FermiLevel [eV] = -4.7103
      potential [eV] = 0.0
    }
    Contact {
      Id = "drain"
      AtomRange = 272 407
      FermiLevel [eV] = -4.7103
      potential [eV] = 0.0
    }
}

Compared to the pristine system, we have modified AtomRange in all the blocks and the values of FirstLayerAtoms.

After running the calculation, we can compare the transmission curve for this structure with a single vacancy and the pristine ribbon by using plotxy:

plotxy --xlabel 'Energy [eV]' --ylabel 'Transmission' -L --xlimits -6.5 -3 \
transmission.dat ../ideal/transmission.dat
Non-SCC Transmission in pristine (green) and single vacancy (blue) ribbon

Non-SCC Transmission in pristine (green) and single vacancy (blue) ribbons

Clearly, the presence of a vacancy introduces some finite scattering which reduce the transmission with respect to the ideal ribbon. In particular, the effect is quite small in the first conductance band while it is more visible in the first valence band and in higher bands. The reflection amplitude is increased near the band edges. This is expected in 1D systems, as near the band edges the density of states diverges (Van Hove singularities), hence the group velocity is lower, and it is known from semi-classical transport theory that the scattering probability is, when short range disorder is present, inversely proportional to the group velocity. The absence of resonant features in the transmission may point to the fact that the vacancy does not induce additional states in the conduction or valence bands. This can be verified by visualising the density of states, as in Figure Non-SCC DOS for single vacancy in sublattice A (linear scale).

Non-SCC DOS for single vacancy in sublattice A (linear scale)

Non-SCC DOS for single vacancy in sublattice A (linear scale)

The same density of states can be visualised on logarithmic scale as well, as in Figure 35.

non-SCC DOS for single vacancy on sublattice A (semilog scale)

Non-SCC DOS for single vacancy on sublattice A (semilog scale)

The vacancy is adding some close energy levels in the gap, as verified already using a conventional DFTB+ calculation (Armchair nanoribbon with vacancy). The Van Hove singularities are partially suppressed as the system no longer possesses translational symmetry along the transport direction. Even in a simple non-SCC approximation, the qualitative picture is consistent with the previous SCC periodic calculation. We will now consider a vacancy sitting on the other sublattice (B) and try to understand whether the relative position of the vacancy is relevant or not by calculating once more the non-SCC transmission and density of states.

Non-SCC armchair nanoribbon with vacancy (B)

[Input: recipes/transport/carbon2d-trans/agr-nonscc/vacancy2/]

Transmission and Density of States

We will now consider a vacancy sitting on the other sublattice (B), i.e. we can take the structure file we used for the ideal ribbon and instead delete the atom number 47. The structure file is:

407  C
C    H
1    1     37.831463060000    -20.000000000000      0.710000000000
2    1     39.061219140000    -20.000000000000      1.420000000000
3    1     39.061219140000    -20.000000000000      2.840000000000
.....
46    1     32.912438770000    -20.000000000000      7.810000000000
48    1     31.682682700000    -20.000000000000      5.680000000000
49    1     31.682682700000    -20.000000000000      7.100000000000
50    1     30.452926620000    -20.000000000000      7.810000000000
.....

The jmol rendering of the geometry:

Geometry with vacancy on sublattice B

Geometry with vacancy on sublattice B

Also in this case we remove an atom from the first PL of the extended device region, therefore the rest of the dftb_in.hsd input file is identical to the one we used for the vacancy on sublattice A. We can therefore just copy it and run the DFTB calculation. The transmission is shown in Figure Non-SCC Transmission for vacancy B (blue), pristine (green) and vacancy A (green) (transmission for vacancy on sublattice B in blue, transmission for vacancy on sublattice A in green and pristine system in green):

Non-SCC Transmission for vacancy B (blue), pristine (green) and vacancy A (green)

Non-SCC Transmission for vacancy B (blue), pristine (green) and vacancy A (green)

We can see a very strong suppression of transmission in the first sub-bands, especially in the first valence band. Again, the absence of resonances may be due by gap states. In fact, we can verify it by plotting the density of states, as shown in Figure 38.

Non-SCC DOS for vacancy in sublattice B

Non-SCC DOS for vacancy in sublattice B

We can clearly see that the vacancy induces some nearly degenerate gap states, and that the density of states at higher energies is largely unaffected. It is known that the relative position of a scattering centre in a graphene nanoribbon with respect to different sub-lattices strongly affects its transport properties, as is shown in these non-SCC calculation. Qualitatively, the picture is also consistent with periodic DFTB+ calculations, with the difference that we directly obtain information on the effect on transport properties via transmission function. This also ensures that we do not have to worry about choosing the right supercell or k-point sampling as the open boundary conditions represent exactly the infinite system with a single scattering centre. As already pointed out earlier, there is no warranty that a non-SCC calculation will give the proper result in a system if relevant charge transfer is occurring, and in general it will not. Therefore in the next section we will repeat the same calculation by solving the SCC problem.

SCC Pristine armchair nanoribbon

A DFTB Hamiltonian is in general given by two terms:

\[H^{SCC} = H^{0} + H^{\text{shift}}\]

Where the component \(H^{\text{shift}}\) is the self-consistent (SCC) correction. The SCC correction is in general needed whenever there is a finite charge transfer between atoms, i.e. whenever there are bonds between atoms with different chemical species or with different coordination numbers. In our case, we can expect a finite charge transfer between the C and H atoms at the edges, and an SCC component may be relevant due to this charge transfer. While in the previous sections, we have only considered the non-SCC component \(H^{0}\), in the next sections we will compute the same calculation by including the correction given by the shifts \(H^{\text{shift}}\).

Note that the equilibrium SCC problem can be tackled in two ways: we could apply the Landauer-Caroli to an SCC Hamiltonian taken, for example, from a periodic calculation (i.e. uploading the SCC component), or we can solve the problem as a full NEGF setup with 0 bias. The code flow is currently such that this second procedure has to be used (however, the first technique will be available in future release). Therefore we will need to learn to set up the input related to two other components of the NEGF machinery: the real space Poisson solver and the Green’s function solution of the density matrix.

In this way we will introduce a first complete input file. It is important, from a didactic point of view, to be clear that as long as the applied bias is zero and we are interested in equilibrium properties, the two approaches are equivalent and the results are only valid in the limit of linear response.

Contact calculation

[Input: recipes/transport/carbon2d-trans/agr-scc/contacts/]

In order to run an SCC transport calculation, the code needs some additional knowledge about the contact PLs. In particular, the SCC shifts and Mulliken charges have to be saved somewhere to enable consistency between the calculation of the self-energy and the calculation of the Poisson potential. To this end, we have to introduce an additional step in the procedure: the contact calculation.

The contact calculation is simply a periodic calculation for the contact PL. As such, not all the field defined in the transport are meaningful and the input file will of course look different. The Geometry block is identical:

Geometry = GenFormat {
<<< 'device_7.gen'
}

While the Transport block needs to be modified as follows:

Transport {
    Device {
      AtomRange = 1 136
    }
    Contact {
      Id = "source"
      AtomRange = 137 272
    }
    Contact {
      Id = "drain"
      AtomRange = 273 408
    }
  Task = ContactHamiltonian {
     ContactId = "source"
  }
}

We first notice the addition of an option Task =ContactHamiltonian {...}, which was previously absent. This block specifies that we intend to calculate the bulk contact SCC properties, and the field ContactId specifies which contact we want to calculate. The field FirstLayerAtoms in the Device block is absent (it does not make sense in a contact calculation) and so are the fields FermiLevel and Potential in the two Contact sections, as they are not meaningful during this step. In general, the philosophy of a DFTB+ input file is that if input fields that would be useless or contradictory are present, the code will halt with an error message.

The Hamiltonian block shows some differences, too:

Hamiltonian = DFTB {
  SCC = Yes
  SCCTolerance = 1e-6
  EwaldParameter = 0.1
  MaxAngularMomentum {
    C = "p"
    H = "s"
  }

  SlaterKosterFiles = Type2FileNames {
    Prefix = "../../slako/"
    Separator = "-"
    Suffix = ".skf"
  }

  KPointsAndWeights = SupercellFolding {
    25 0 0
    0 1 0
    0 0 1
    0.0 0.0 0.0
  }
}

The flags SCC = Yes and SCCTolerance = 1e-6 enable the SCC calculation. A tight tolerance in the contact calculation, and in general in transport calculations, helps to avoid artificial mismatches at device/contact boundaries. The other parameters are the usual ones, except for the KPointsAndWeights, which deserves special attention.

The bulk contact is of course a periodic structure, hence we need to specify a proper k-point sampling, as we would do in a regular periodic DFTB calculation. However, you should be careful about the way the lattice vector is internally defined. When the input system is a cluster (C), i.e. it has no periodicity in directions transverse to the transport direction(s), the lattice vector of the contact is internally reconstructed and assigned to be the first lattice vector, regardless the spatial orientation of the structure. This means that the KPointsAndWeights for a cluster system are always defined as above: a finite number of k-points along the first reciprocal vector (according to a 1D Monkhorst-Pack scheme) and sampling with values of 0 along the other two directions. The reason for this choice is that we do not want to assign a specific direction to the structures, i.e. at this level we do not assume in any way that the structure must be oriented along x,y or z direction.

Note also that as the contact information is used in the transport calculation, it is a good idea to use a dense k point sampling and a low SCC tolerance, in order to get a very well converged solution. The contact calculation will be usually much faster than the transport calculation, so this does not usually present a problem.

On the other hand, this rule regarding k-points does not apply to periodic transport calculations, as the periodicity along the transverse directions must also be preserved (refer to the following section for a periodic system example). We can run the calculation by typing:

dftb+ dftb_in.hsd | tee output.log

After running the calculation, we notice that a file shiftcont_source.dat is generated. This file contains the information useful for the transport calculation (shifts and charges of a bulk contact). It is suggested you also keep a copy of the detailed.out for later reference. We can obtain the value of the Fermi energy, which we will later need, from detailed.out as -4.7103 eV (this is also stored in the “shiftcont_*.dat” files).

We can now run the same calculation for the drain contact by just modifying the Task block:

Task = ContactHamiltonian {
     ContactId = "drain"
  }

The contact are identical, therefore we expect the same results, also with the same Fermi energy. We now have a file shiftcont_drain.out, which is equivalent to shiftcont_drain.dat apart from small numerical error. In fact, we could have simply copied the previous contact results into this file.

Now that the contact calculation is available, we can set up the transport calculation.

Transmission and Density of States

[Input: recipes/transport/carbon2d-trans/agr-scc/ideal/]

In order to calculate the transmission for the SCC system, we have to copy the files shiftcont_drain.dat and shiftcont_source.dat into the current directory:

cp ../contacts/shiftcont* .

Then we have to specify some additional blocks with respect to a non-SCC transport calculation. We first look at the Transport block itself:

Transport {
  Device {
    AtomRange = 1 136
    FirstLayerAtoms =  1 69
  }
  Contact {
    Id = "source"
    AtomRange = 137 272
    FermiLevel [eV] = -4.45
    potential [eV] = 0.0
  }
  Contact {
    Id = "drain"
    AtomRange = 273 408
    FermiLevel [eV] = -4.45
    potential [eV] = 0.0
  }
  Task = UploadContacts {
  }
}

The atom indices are of course the same, as the geometry of the system is not changed. This time though, we explicitly specified a Task block named UploadContacts, which declares that we are now running a full transport calculation. Task = UploadContacts {} is the default and does not take any additional parameters, therefore you can safely omit it.

Now that we are solving the full SCC scheme, we will allow for charge transfer between the open leads and the extended device region, therefore it is important to set a well-defined Fermi energy. While this does not make any difference in a non-SCC transmission calculation, it is crucial for the SCC calculation. A wrong or unphysical Fermi energy will lead to incorrect charge accumulation or depletion of the system.

To this end, you will have to pay some attention to the definition of the Fermi energy. As we are calculating a semiconductor system, the Fermi level should be in the energy gap. By calculating a band structure or by inspection of the eigenvalues in the file detailed.out you can verify that the value -4.7103 is on the edge of the valence band. This can be explained as numerically the Fermi level is defined by filling the single particle states till the reference density is reached. Its position inside the gap of a semiconductor is in some senses arbitrary within that range (a common convention which DFTB+ uses is to set it at the middle of the gap). Therefore, while in metallic system we may ensure consistency and use a well converged Fermi level at some specific temperature during all our transport calculation, in the case of a semiconductor system we can manually set the Fermi level in the middle of the energy gap (for this system, roughly at -4.45 eV) and freely vary the temperature as long as the gap is larger than several times the value of kT.

We will see in the following that there are some ways to verify that the Fermi level is defined consistently, as this is often source of confusion. Note also that, differently from other codes, DFTB+ allows for different Fermi levels in different contacts, which can be useful when heterogeneous contacts are defined (for example, in a PN junction). In that case a built-in potential is internally added to ensure no current flow at equilibrium.

In the Hamiltonian block now an SCC calculation has to be specified:

Hamiltonian = DFTB {
  SCC = Yes
  SCCTolerance = 1e-6
  ReadInitialCharges = No
  ...
Poisson Solver

Differently from the non-SCC calculation, we now need to specify a way to solve the Hartree potential and the charge density self-consistently. In a NEGF calculation, we use a real-space Poisson solver to calculate the potential, and a Green function integration method to calculate the density matrix:

...
Electrostatics = Poisson {
  PoissonBox [Angstrom] = 40.0 30.0 30.0
  MinimalGrid [Angstrom] = 0.5 0.5 0.5
  SavePotential = Yes
}

The Poisson section contains the definition of the real space grid parameters. Note that differently from a normal DFTB+ calculation, simulating regions of vacuum is not for free, as the simulation domain must be spanned by the real space grid. The grid is always oriented along the orthogonal cartesian coordinate system. PoissonBox specifies the lateral length of the grid. The length along the transport direction is ignored as it is automatically determined by the code (in this case, z=30.0). The length along the transverse direction are relevant and should be carefully set. In order not to force unphysical boundary conditions, you may extend the grid at least 1 nm away. If a strong charge transfer is present, you may go for a larger box, according to your available computational resources. A poorly defined grid can lead to no convergence at all, to a very strange (and slow) convergence path or to unphysical results. MinimalGrid specifies the minimum step size for the multigrid algorithm. Values between 0.2 and 0.5 are usually good, where a lower value stands for higher precision. SavePotential = Yes will return a file containing the potential and charge density profile, for later reference. These files can be quite large, therefore the default is No.

Density Matrix Calculations - GreensFunction solver

The solver is now specified as GreensFunction. With this definition, we instruct the code not to solve an eigenvalue problem but rather to calculate the density matrix by integration of the Keldysh Green function:

Solver = GreensFunction{}

This block provides the SCC charge density with or without applied bias. The options define the integration path. Usually the default options are good enough in most cases and advanced users may refer to the manual or other examples in this book.

The Mixer options is present in DFTB+ calculations as well.:

Mixer = Broyden {
  MixingParameter = 0.02
}

Convergence is known to be critical in NEGF schemes. In that case, a low MixingParameter value will help to avoid strong oscillation in the SCC iterations.

The last block is Analysis:

Analysis {
  TunnelingAndDos {
    Verbosity = 101
    EnergyRange [eV] = -6.0  -3.0
    EnergyStep [eV] = 0.01
  }
}

This block is identical to the non-SCC calculation as the same task is performed: calculation of Transmission, current and DOS by using the Landauer-Caroli formula. The Transmission will be of course be different due to the fact that the ground state charge density is now solution of the SCC Hamiltonian and we have slightly changed the energy range as the SCC component introduce a shift of the band-structure (try to compare the SCC and non-SCC transmission results when you are done). We can now run the calculation:

mpirun -n 4 dftb dftb_in.hsd | tee output.log

Where -n 4 should be adapted to the number of available nodes. As transport calculations in DFTB+ are parallelised on energy points, a quantity larger than 40 (the default number of integration points at equilibrium) will not speed up the calculation of the density matrix.

An inspection of the file detailed.out reveals that we have additional information with respect to the non-SCC calculation, including a list of atomic charges and orbital population, as now the SCC density matrix has been calculated. The transmission is also saved as separate file, and is shown in Figure 39.

SCC transmission in pristine AGR

SCC transmission in pristine AGR

As you would expect, it still step-like as in the non-SCC calculation. This is correct, as we’re calculating an ideal 1D system. The bandwidth (i.e., the steps width) may differ due to SCC contribution and the overall transmission is shifted. Note that while the non-SCC calculation is very robust, meaning that you will always get step-like transmission for a 1D system, in the SCC calculation a poor definition of the boundary conditions, of the bulk contact properties or of the additional GreensFunction and Poisson blocks may induce numerical artefacts and scattering barriers which should not be there. As a result, the transmission will not appear step-like but rather visibly smoothed out.

You can also verify the quality of the calculation by inspection of the potential and charge density profiles. In a pristine periodic system we would expect a periodic potential, without discontinuities at the boundary between extended device and electrodes. The information needed to construct the real space potential and charge density are contained in 5 files: box3d.dat, Xvector.dat, Yvector,dat, Zvector.dat, potential.dat and charge_density.dat. The first 4 files contain the grid information, and the last two ones the list of potential and charge density values (following a row major order). Those information can be converted to any useful with some simple scripting, we provide an utility called makecube which can be used to convert them to Gaussian cube format or a more flexible vtk format. There’s plenty of software to visualise vtk or cube files, but unluckily at present current choices of software which are effective at visualising real space grid data are weak at visualising atomistic structures, and vice versa. In the following we will use paraview and work with the vtk format. Paraview is freely available and is supplied with many gnu/linux distributions as a compiled package.

The vtk file can be obtained by simply running:

makecube potential.dat pot.vtk
Potential profile along the nanoribbon

Potential profile along the nanoribbon

An extensive explanation of paraview features is beyond the scope of this tutorial. Following some easy steps, you can produce the potential map shown in Figure 40.

  1. Open paraview and import the file pot.vtk from File->Open
  2. Click on Properties->Apply (Properties are usually visualised on the left side of the screen) and you should see the bounding box in the visualisation windows.
  3. In the Pipeline browser select the file pot.vtk by clicking once on it, and then select the Clip filter from Filters->Alphabetical (or from the filter toolbar).
  4. In Properties, click on ‘Y Normal’ to produce a clip along the nanoribbon.
  5. Click on Properties->Apply.

The plot shown in Figure 40 above is the self-consistent potential along the nanoribbon. We can see that the charge transfer between carbon and hydrogen at the edges results in a non-flat potential. At a first glance, the potential looks quite homogeneous, meaning that there are no clear discontinuities at the box boundary. This is important: being it a homogeneous ribbon, the potential should have the same periodicity as the lattice. We can verify this with a closer inspection by plotting a cut along a line. We apply the following steps:

  1. We select pot.vtk in the Pipeline Browser and Filters->Alphabetical->Plot Over Line
  2. From the Properties window, we select ‘Z Axis’ and click on ‘Apply’

By following this procedure we obtain Figure Potential profile along the nanoribbon.

Potential profile along the nanoribbon

Potential profile along the nanoribbon

As you can notice, there is a discontinuity at the interface. However, it is quite small (~ 12 meV). Defining a ‘perfect’ interface between the bulk semi-infinite contacts and the device region is very difficult, especially in a semiconductor where no free charge can contribute to screen such an interface potential. A smaller tolerance in the self-consistent charge during the contact and the device calculation, a finer calculation of the Fermi level (in metallic systems) and a finer Poisson grid can decrease the discontinuity: you should be able to reach about 1 meV, but it is difficult to go below this value. However, as you can see in the transmission plot, as long as the discontinuity is this small, it hardly affects the transmission.

However, it is important for you to verify that the behaviour at the boundaries is reasonable. Otherwise, the extended region may be too small to allow to the relevant physical quantities (charge, potential) to relax to bulk values. Be aware that numerical errors are unavoidable, therefore it is important to understand their relevance and the impact on the results. In the transmission calculation we do not notice anything different because the energy step is close to the mismatch at the boundaries.

After running the calculation for the pristine system, we will introduce vacancies as we did in the non-SCC calculation. The results should be now directly comparable to the bulk periodic SCC calculation.

SCC armchair nanoribbon with vacancy (A)

[Input: recipes/transport/carbon2d-trans/agr-scc/vacancy1/]

We will now calculate the SCC transmission for the nanoribbon with a vacancy on the sublattice A, using the same input structure set up for the non-SCC calculation. The contacts are identical to the pristine case, therefore in the following we will only modify the extended device calculation.

Transmission and Density of States

As previously done, the transport section must be modified in order to account for the different number of atoms in the extended device region:

Transport {
    Device {
      AtomRange = 1 135
      FirstLayerAtoms =  1 68
    }
    Contact {
      Id = "source"
      AtomRange = 136 271
      FermiLevel [eV] = -4.45
      potential [eV] = 0.0
    }
    Contact {
      Id = "drain"
      AtomRange = 272 407
      FermiLevel [eV] = -4.45
      potential [eV] = 0.0
    }
  Task = UploadContacts {
  }
}

We use the same Fermi level and the files shiftcont_source.dat and shiftcont_drain.dat as in the pristine system calculation, as the contacts are not modified.

The Hamiltonian block is also not modified, except for an additional finite temperature:

Hamiltonian = DFTB {
  ...
  Filling = Fermi {
    Temperature [Kelvin] = 150.0
  }
  ...
}

A finite temperature is used to provide a finite temperature broadening, useful if the vacancy induces partially filled gap states. In general, temperature broadening may improve convergence and dump oscillations in the SCC iterations.

The Analysis block is also similar, we add the DOS calculation to verify if we can identify a vacancy state:

Analysis {
  TunnelingAndDos {
    Verbosity = 101
    EnergyRange [eV] = -6.0  -3.0
    EnergyStep [eV] = 0.025
    Region {
      Atoms = 1:135
    }
  }
}

As usual, you can now create the GS and contacts directories, copy the shiftcont_source.dat and shiftcont_drain.dat in the current directory and run the calculation. The density of states and transmission are shown in Figure Density of states for vacancy (A) and Transmission for vacancy (A).

Density of states for vacancy (A)

Density of states for vacancy (A)

Transmission for vacancy (A)

Transmission for vacancy (A)

The vacancy states are located in the energy gap, consistently with the periodic calculation, and that the tunnelling curve is qualitative similar to the non-SCC calculation. The first conduction and valence band are weakly affected by the vacancy which does not act as a strong scatterer. There is no signature of resonances, as the additional levels are located in the gap.

Note also that we previously recommended the use of large extended regions and to verify that the potential and charge density are smooth at interfaces. As you can see in Figure 44, the impurity is very close to the boundaries, resulting to a potential profile which varies significantly close in to the boundary. It is left to the reader to verify that the overall transmission does not change significantly if a longer extended region is considered.

Potential profile for vacancy (A)

Potential profile for vacancy (A)

SCC armchair nanoribbon with vacancy (B)

[Input: recipes/transport/carbon2d-trans/agr-scc/vacancy2/]

We will now run the same calculation, but with the vacancy on the sublattice B. As in the non-SCC case, the only difference with the previous calculation is the location of the vacancy, therefore the input file is absolutely identical. The contacts are the same, therefore all we have to do is copy the shiftcont_source.dat and shiftcont_drain.dat files into the current directory and run the calculation.

The resulting transmission and density of states are shown in Figures Density of states for vacancy (B) and Transmission for vacancy (B).

Density of states for vacancy (B)

Density of states for vacancy (B)

Transmission for vacancy (B)

Transmission for vacancy (B)

We immediately notice that the Van Hove singularities are strongly suppressed and that the valence band is substantially suppressed. Consistently with the picture obtained by periodic calculation, a quasi-bounded vacancy level hybridises with the valence band edge causing strong back-scattering. A comparison between all the three cases (see Figure Transmission for pristine system (blue), vacancy (A) (green) and vacancy (B) (red)) shows that the scattering probability is strongly affected by the exact position of the vacancy. This is generally true for other kinds of short range scattering centres in graphene nanoribbons, such as substitutional impurities. We can also notice that in this particular case the non-SCC approximation is qualitatively consistent for two reasons: the vacancy levels are not populated and the charge transfer at the edges is not critical as the edges contribute poorly to the transmission in an armchair ribbon.

Transmission for pristine system (blue), vacancy (A) (green) and vacancy (B) (red)

Transmission for pristine system (blue), vacancy (A) (green) and vacancy (B) (red)

Example: Molecular Junction

[Input: recipes/transport/molecular-junction/]

This example guides in the definition of a contact/molecule/contact geometry for transport calculations.

Tasks 0: Preparing the structure

Before starting any calculation it is necessary to define the structure following the rules given in the section, Specifying the geometry. Here we give some tips in order to speed up the process. The following procedure is valid for systems with 2 contacts placed along the same transport direction.:

  1. It is convenient to initially define a structure without paying attention to its atom ordering. The system should be complete, with each contact comprising two PLs.
  2. Once the structure has been prepared, convert it to a simple xyz format file
  3. Reorder the atoms according to the coordinates along the transport direction. In Linux this can be achieved using sort, e.g., $ sort -n -k 2 input.xyz > output.xyz This will sort according to ‘column #2’, corresponding to the ‘x’ coordinates. Even better, the following command line will skip the first two lines $ (head -2 input.xyz && tail -n +3 input.xyz | sort -nk 2) > output.xyz
  4. Check that, after sorting, the contacts comprise of 2 PLs with exactly the same atomic ordering (see Setup Geometry utility if you only have 1 PL).
  5. Open the sorted xyz structure with a text editor and move the first contact to the end of the list of atoms. After this place the atoms of the second PL, the first PL must be closer to the device region.
  6. Reconvert back to gen format (xyz2gen) and add the supercell information, if present.
System with atoms sorted along the x-direction

System with atoms sorted along the x-direction.

Now the system is ready for transport calculations. It is possible to define PLs also in the molecular (device) region, which can be quite useful for speeding up calculations. The size of these PLs depends on the largest cutoff in the tight-binding interactions. Figure _fig_transport_junction-sort-col shows an example of PLs highlighted in different colours.

System showing the device PLs highlighted in colours

System showing the different PLs highlighted with colours.

Relaxation of the structure

Only the molecular atoms were allowed to move, namely the first 95 atoms in the list. The contact regions must be kept fixed. The definition of the two PLs, as rigid shifted copies of each other, will otherwise be lost. It is a good idea to build the contact PLs from previously relaxed bulk structures.

In setting up the example structure, we found it a bit difficult to relax at first since the SCC-loop did not converge. In such cases it is better to proceed in a sequence of different relaxation steps.:

  1. Initially the structure was optimised with a non-SCC calculation and just Gamma point sampling.
  2. More k-points where added in the direction y, parallel to transport (for this geometry the structure periodically repeats along this direction, but not along z due to the large vacuum gap for this supercell). Then re-relaxed.
  3. SCC was finally switched on and the structure re-relaxed from the final non-SCC geometry.

In order to help SCC convergence and avoid electrostatic artefacts, the dangling bonds at both contact ends have been saturated with hydrogen atoms. In other cases, if the system allows, it is possible to initially optimise a periodic supercell which repeats along the transport direction (but where any contact atoms are kept fixed). It is then possible to relax under open boundary conditions, but this can be left as a final refinement step.

Task 1: Calculation of the contacts

In the input file you need to specify the Geometry and Transport blocks for the contact calculations tasks:

Geometry = GenFormat {
<<< str.gen'
}
Transport {
  Device {
    AtomRange = 1 95
  }
  Contact {
    Id = "Source"
    AtomRange = 96 143
    PLShiftTolerance = 0.01
  }
  Contact {
    Id = "Drain"
    AtomRange = 144 191
    PLShiftTolerance = 0.01
  }
  Task = ContactHamiltonian{
     ContactId = "Source"
  }
}

Notice the tag PLShiftTolerance added here, since in this case the 2 contact PLs are not perfect rigid copies of each other. It is possible to comment out this tag and see the error message from DFTB+.

With task ContactHamiltonian DFTB+ performs a contact calculation. The code extracts, from the complete device geometry, the contact defined with Id = "Source" and constructs appropriate supercell vectors from the coordinates given in the two PLs. The rest of the input file refers to a standard DFTB+ calculation of a supercell system, for which an appropriate k-point sampling must be specified:

Hamiltonian = DFTB {
  SCC = Yes
  MaxAngularMomentum {
    C = "p"
    O = "p"
    H = "s"
  }

  SlaterKosterFiles = Type2FileNames{
    Prefix = "./mio-1-1/"
    Separator = "-"
    Suffix = ".skf"
  }

  Filling = Fermi{
    Temperature [Kelvin] = 0.0
  }

  # Appropriate for transport along x, periodic along y and a vacuum gap on z:
  KpointsAndWeights = SupercellFolding {
     16  0   0
     0   4   0
     0   0   1
     0.5 0.5 0.5
  }

}

Contact supercell

Since in this example the input geometry is defined as a supercell, DFTB+ preserves the meaningful supercell vectors. In this case the transport direction is along x and the relevant periodicity is along the lateral direction, y. The supplied supercell vector along x is dummy as the structure is open in that direction, whereas the periodicity along z defines the graphene-graphene separation. This z value does matter in the definition of the Poisson Box (see below) so cannot be made arbitrarily large. The DFTB+ code internally builds a supercell vector for the contact along the x-direction, taken from the geometry definition of the two PLs. Hence, the contact computation is performed for the two PLs of the contact and results are saved in shiftcont_source.dat for later use. For this reason it is recommended to set appropriate values for the k-point sampling in all directions. In this particular case we can set 1 k-point along z, since this is a dummy periodicity, but both x and y should be convergently sampled (note that in this example the contact region the cell is shorter along x than y so will probably require more k-points).

An appropriate k-sampling is important in order to converge the Fermi energy and contact shift calculations. It is possible to experiment by changing the number of k-points and check the convergence of the SCC-charges and Fermi energy in detailed.out.

After the “Source” contact has been computed, you must change the input file and carry out a similar computation for the “Drain” contact.

Task 2: SCC Calculation of the device in equilibrium

The SCC calculation of transport usually starts with an equilibrium calculation of the system with open-boundary conditions.

The Transport section must be modified:

Transport {
  Device {
    AtomRange = 1 95
    FirstLayerAtoms = 1, 21, 43, 54
  }
  Contact {
    Id = "Source"
    AtomRange = 96 143
    PLShiftTolerance = 0.01
    FermiLevel [eV] = -4.665975
    Potential [eV] = 0.0
  }
  Contact {
    Id = "Drain"
    AtomRange = 144 191
    PLShiftTolerance = 0.01
    FermiLevel [eV] = -4.665975
    Potential [eV] = 0.0
  }
}

Here it is important to note the value of FermiLevel which is taken from the contact calculations as reported in the files shiftcont_source/drain.dat. In case of identical contacts the Fermi levels will both be exactly the same. In this case the two contacts are sightly different resulting in a tiny difference of Fermi levels. Given the very small difference and in order to avoid complications that will be discussed in another tutorial, we have forced both Fermi levels to an averaged value.

The contact potentials are set to 0.0 in order to start from an equilibrium calculation.

The keyword FirstLayerAtoms is used for the definition of the principal layers in the extended molecule of the device region. As the name of the keyword suggests, the principal layers are defined by specifying the first atom of each layer as described above (Task 0).

Poisson solver options

For the transport calculation the default GammaFunctional solver is substituted with the Poisson solver:

Electrostatics = Poisson {
  MinimalGrid [Angstrom] = 0.4 0.4 0.4
  AtomDensityTolerance = 1e-5
  CutoffCheck = Yes
  BuildBulkPotential = No
  SavePotential = Yes
  PoissonAccuracy = 1e-5
}

The tag Poisson is used to define the size of the Poisson box domain. The Poisson equation is solved via a real-space multigrid solver that employs finite-differences for discretisation on a finite box with a regular grid (structured mesh). The charge density on the right-hand-side is constructed exactly as in standard gamma-functional of DFTB, namely expanding the charge density into spherical s-like atomic densities weighted by atomic Mulliken charges.

The real-space box size is obtained in this example from the supercell definition. The box length along the transport direction is constrained by the position of the contacts and is internally adjusted.

The Poisson equation is solved with the following boundary conditions (BC):

  1. Dirichelet or mixed BC on the faces containing contacts.
  2. Neumann BC on the remaining faces.

The meaning of the additional options for the Poisson tag are the following:

MinimalGrid = 0.4 0.4 0.4
is used to specify that the grid must be spaced at less than 0.4 Ang. in all dimensions. The actual grid is adjusted internally since the number of grid points in every direction must be a power of 2 (exactly \(N = 2^n + 1\)).
AtomDensityTolerance = 1e-5

is used to specify, where the exponential decaying spherical s-like atomic charge densities should be cut off. Specifying a certain value here makes sure that all atoms contributing a density higher than the given value are considered when calculating the amount of charge at a certain point (default: 1e-5). The appropriate cutoff radius for that tolerance is calculated automatically by the code (and reported in the output). It is determined by finding the cutoff distance for each atom (\(\alpha\)), where the s-like charge density

\[n_\alpha(r) = \frac{\tau_{\alpha}^3}{8 \pi} e^{-\tau_{\alpha}r}\]

becomes smaller than the given tolerance. Then the maximal cutoff found for all atom types in the system is used. The quantity \(\tau_{\alpha}= \frac{16}{5} U_{\alpha}\) is the relationship between the extinction coefficient and the Hubbard parameter in atomic units (see [EPJE1998]).

In order to have a consistent calculation, the determined cutoff length must be smaller than the width of the principal layers when doing a calculation with contacts. The program will check for this criterion and stop if it is not fulfilled. In this example we set it exactly to the default value, only for clarification purposes.

CutoffCheck
if set to No, the code omits the check at to whether the cutoff for the atomic densities (either determined by AtomDensityTolerance or directly set by AtomDensityCutoff) is larger than the widths of the principal layers in the contacts. Please note that a cutoff bigger than the width of any contact PL results in inconsistent calculation, so turning this check off is highly discouraged, unless you exactly know what you are doing. (for experts only)
BuildBulkPotential = No
is used to specify that at the device/contact interfaces the bulk potential must be imposed as BCs. The bulk potential is computed for an ideal contact (infinite wire, sheet or surface). Here we should remind you that a key assumption in transport calculations is that the contacts are in equilibrium and that the device/contact interfaces are sufficiently far from the device with enough extended molecule contact-like material such that bulk conditions are recovered. This means that the charge density and potential at this interface should smoothly join with the bulk values. Setting this flag to Yes is important whenever there is a charge redistribution within the contact atoms that has an effect on the bulk potential, like in structures of heteronuclear species (e.g. SiC, GaAs, ZnO, etc.). In the case of graphene there is no charge redistribution between atoms hence the contact potential is 0, so its calculation is not necessary.
Green’s function options

Here we discuss the most important parameters to be set in the calculation of the system Green’s functions:

Solver = GreensFunction {
  Delta [eV] = 1e-4
  ContourPoints = 30 40
  RealAxisStep [eV] = 0.025
  EnclosedPoles = 0
}

The Green’s function approach is used to compute the density matrix directly for the system and as such it does not solve the eigenproblem (no eigenvectors are computed, hence no band structure is available).

ContourPoints
is used to specify the number of quadrature points in the contour integration (see also DFTB+ manual for a description of the complex contour). The default 20 20 has been increased here in order to improve convergence.
Delta
defines a small imaginary number used in the computation of the Green’s function. The default is usually fine.
EnclosedPoles
is set to 0 when the temperature is low (T=0). For T>0 few poles (usually 3) needs to be included within the contour.

Task 3: Bias calculations

Contact potentials can be set in order to put the system under bias. In this example we have performed 3 calculations by setting different contact potentials, hence total biases of 0.0, 0.5 V and 1.0 V:

# A bias of 0.5 V between contacts
Contact {
  Id = "Source"
  Potential [eV] = -0.250
}
Contact {
  Id = "Drain"
  Potential [eV] = 0.250
}

Notice that we set a symmetric bias on the junction, but this is not necessary. Also notice that in order to set a potential in volts it is necessary to specify energy units of eV.

When the system is biased, the Green’s function calculation also requires points along the energy axis on the bias window.

RealAxisStep
Is used to specify the point sampling along the real axis. The value was set in order to have 20 points for the bias of 0.5 V and 40 points at 1.0 V. The default step is 1500 points/Hartree, corresponding to about 0.018 eV.
RealAxisPoints
Alternatively, it is possible to set directly the number of points using this tag.

Notice that at finite temperatures the real axis integration extends by a certain multiple of kT beyond the bias window. This is essentially due to the tail of the Fermi function. It is instead possible to set the cutoff directly in units of kT, using the keyword FermiCutoff.

Critical values and convergence issues

Open BC calculations are not at all trivial. Convergence can be slow and is usually slower than supercell or cluster calculations. This happens because charge fluctuations within the central region (remember this calculation is in the Grand Canonical Ensemble) makes convergence of the SCC loop harder. Whenever convergence issues are encountered the user should consider trying the following points

  1. Increase the number of ContourPoints, especially the second number.
  2. Decrease the Poisson MinimalGrid: values between 0.2 and 0.4 are usually fine.
  3. Decrease the BroydenMix parameter (MixingParameter = 0.05 or 0.02 can help).
  4. Compute with the Temperature > 0.

Finite temperature calculations can be quite useful when there are partially filled states, since these tend to oscillate wildly around the Fermi energy preventing convergence. The temperature can be progressively decreased by restarting calculations and reading previously computed charges (ReadInitialCharges=Yes).

Results: Analysis block

The output of the calculation is reported in the file detailed.out. The file contains the same information of traditional DFTB+ calculations. However, since the Green’s functions solver does not find eigenvalues and eigenvectors, some values are not shown.

In order to calculate the transmission probability between contacts or the density of states the Analysis block must be specified:

Analysis{
  TunnelingAndDos{
    verbosity = 0
    EnergyRange [eV] = -9.0 -2.0
    EnergyStep [eV] = 0.02
    Delta [eV] = 1e-4
  }
}

As a post SCC calculation, the transmission across the device is computed. This is accomplished by defining the TunnelingAndDos block. EnergyRange Defines the energy range for the transmission plot EnergyStep Defines the interval sampling step.

The code will output files like transmission.dat which contain the transmission as a two column dataset, this can then can be plotted for example with xmgrace.

Transmission

Transmission function for the graphene/molecule/graphene system.

Example: Local Currents

[Input: recipes/transport/local-currents/]

This example guides the calculation of local atom-to-atom currents. DFTB+ can compute atom-to-atom currents starting from the general expression for the layer to layer current,

\[J = \frac{2e}{\hbar} \text{Tr} \left[H_{L,L+1} \rho_{L+1,L} - S_{L,L+1} \varepsilon_{L+1,L} - h.c. \right]\]

where \(L\) and \(L+1\) are principal layers of the transport region, \(\rho\) is the density matrix, \(\varepsilon\) is the energy-weighted density matrix, and the trace operation is used to add up all contributions. This expression is exact and guarantees current conservation along the device. From the above it is possible to write down an expression for current contributions from atom \(i\) to atom \(j\) as,

\[J_{ij} = \frac{2e}{\hbar} \sum_{\mu \in i} \sum_{\nu \in j} \text{Im} \left[ H_{\mu \nu} \rho_{\nu \mu} - S_{\mu \nu} \varepsilon_{\nu \mu}\right]\]

It should be pointed out that a derivation of the formula above is possible by, for instance, considering the time derivative of atom-projected wavefunctions but that leads to issues if the basis is not orthogonal (the DFTB case). Some implementations of local currents perform a Lowdin transformation in order to obtain orthogonalised local states to start, but we prefer to avoid such transformations that imply diagonalisation of potentially large matrices, beside the fact that with semi-infinite contacts the Lowdin rotation requires some truncation. In any case the non-locality and degree of ambiguity is not fully solved since Lowdin orbitals are quite extended.

The formulation we give still produces some issues the user should be aware of. Consider for instance a current in a linear chain with second-neighbour tight-binding interaction, as shown in Figure 51

bond currents in a linear chain

Bond currents in a linear chain with 2nd neighbour interactions.

Surprisingly, in the limit of low biases the bond currents between nearest neighbour atoms is positive, whereas the contribution from 2nd neighbour atoms is in the opposite direction. In units of quanta of conductance, we obtain \(J = 1.2 \text{G}_0 V\) between nearest-neighbour atoms and \(J = -0.1 \text{G}_0 V\) between second-neighbour atoms. If we cut any surface across the chain and compute the total current we correctly obtain \(J = 1.0 \text{G}_0 V\), i.e. a single quanta of conductance, as expected. The right panel of the figure above shows what happens when we compute atomic currents as a summation over bond currents. Then, for each atomic site we have an outgoing current flux of \(J = 1.2 - 0.1 = 1.1 \text{G}_0 V\). In the steady-state this is balanced by an opposite ingoing flux. Yet the net outcome of the calculation is a wrong conductance! What is missing is the 2nd neighbour contributions bypassing the next neighbour atoms. Clearly in a 1-dimensional chain this problem can be easily corrected, however in higher dimensional systems the correction might not be trivial or possible. Consider for instance the case of graphene as shown in Figure 52,

bond currents in a graphene

Bond currents in a ideal graphene with 2nd neighbour interactions.

Similar to the linear chain, the 1st neighbour currents go along the bias direction, whereas the 2nd neighbour currents point upstream. Clearly in general the contribution of 2nd or higher neighbour bond currents on each atom are not simple to include. Notice that the very same conceptual problem would occur also when the Lowdin transform is performed.

In conclusion, the local-orbital approach to bond currents produces well-defined quantities on the connection graph that however do not define a vector field. Despite this it is still useful to look at them. In the following example we show the difference between atom current and bond currents. In the second case we only consider first neighbour contributions. Atom currents on the other hand are computed as a summation of all neighbour contributions and based on the discussion above, they tend to produce a slightly overestimated result.

The reader might think that the easy solution to the problem could be to use the atomic orbitals to project the bond currents onto a real-space grid in order to recover a vector field. Unfortunately, this procedure produces strong violations of current continuity unless a complete set of orbitals is used. However the usual atomic orbital basis sets are far from complete.

Nanoribbon example

In order to obtain the local currents the user has to set the following input block:

Solver = Greensfunction {
  localCurrents = Yes
}

The reason why local currents are specified in the Greensfunction block and not within the Analysis block (e.g., TunnelingAndDos), is because the evaluation of local currents requires the Density Matrix and Energy-Weighted density matrix, according to the equations above. Hence all options concerning the contour integral apply to these calculations, rather than the options concerning the transmission function. The structure considered is shown in Figure 53.

Nanoribbon between graphene contacts

Nanoribbon considered in this tutorial. Periodic BC are used along X.

It consist of a nanoribbon in between graphene contacts. Periodic Boundary Conditions have been applied along the x-axis. Dangling bonds have been saturated with hydrogen atoms. In order to discuss the more complex case of currents in periodic systems, we consider a graphene nanoribbon (GNR) with a diagonal orientation. The hydrogen atoms are found very important in order to obtain a converged SCC-loop. Mulliken charges compare very well with values obtained for a supercell calculation based on usual DFTB. In order to converge the SCC loop we had to set a value for the delta-parameter in the Green’s function definition larger than the default value:

Solver = Greensfunction {
  delta = 5e-4
  localCurrents = Yes
}

This might occasionally happen when the iterative decimation solver of the surface Green’s functions fails with inversion errors or unusually long calculations that ends in NaN results. In this tutorial converged charges are precomputed and read from charges.dat. Notice that the file is stored as formatted text, hence the following option is required:

Option{
  ReadChargesAsText = Yes
}

The user might experiment restarting the SCC loop, this should take about 40 SCC iterations to converge.

Local bond currents in a nanoribbon

Local bond currents in a nanoribbon between graphene contacts.

Figure 54, shows the bond currents. This figure is obtained as a post-processing of the code outputs using the small program flux.f90, which can be found in the folder tools/misc/transport/. DFTB+ produces the following output files:

supercell.xyz
lcurrents_001_u.dat
lcurrents_u.dat

The file supercell.xyz contains the input geometry with additional atoms of the neighbour cells that are used to calculate the bond directions and draw the arrows. Notice that only bond-currents going from the central cell towards the atoms in the periodic copies are computed. The figure can be made periodic by copy-paste of repeating cells. The second file contains the k-resolved local currents, where 001 here stands for the first k-point and “u” stands for up spin. The last file contains the summation over all k-components with corresponding weights.

Make sure the program flux.f90 has been compiled and is available. The figure above was obtained by issuing the following command:

>> flux supercell.xyz -b lcurrents_u.dat 3 -w 0.3 -f 1.0 > scr.jmol

Notice the value 3 used, representing the number of neighbours considered. Then it is possible to plot this with:

>> jmol supercell.xyz -s scr.jmol

The white background colour in jmol was obtained with the jmol command:

> background white
> wireframe 0

and the black arrows can be obtained using the -c black as last option to flux.

Local atomic currents in a nanoribbon

Local atomic currents in a nanoribbon between graphene contacts.

Similarly, it is possible to draw atomic currents shown in Figure 55, with the command:

>> flux supercell.xyz -a lcurrents_u.dat 24 -w 0.3 -f 1.0 > scr.jmol

Here we consider 24 neighbours for every atom to reach a converged result. The user can experiment by changing this value as well as well as the value used for bond currents.

The rendering of the local current as arrows in jmol is a little primitive. One difficulty is that a linear scale has a narrow window of values that can be rendered with visible arrows. Arrows representing atomic currents are adjusted in length. The option -f can be used as a global rescaling factor in order to adjust all lengths by a factor. The arrows representing bond currents are limited within the bonds. In this case, in order to emphasise different magnitudes arrow widths are used. These can be adjusted using the option -w. In most cases it might be necessary to edit flux.f90 in order to adjust the aesthetics of the rendering as desired.

Interfaces with other codes

In this section some of the methods for communication between DFTB+ and external software are discussed. This can be used to improve ease of use and allows you to invoke DFTB+ via software that you may be more more familiar with, such as the Atomic Simulation Environment (ASE) or to access DFTB+ functionality via your own code in languages like Python.

Use of interface communication also offers the possibility of expanding the applications and functionality of DFTB+. For example, DFTB+ can serve as an energy/force engine for an external driver which then could perform calculations like geometry optimisation, coupled quantum / molecular mechanics calculations or more advanced molecular dynamics methods at the DFTB level.

Atomic Simulation Environment - ASE

The Atomic Simulation Environment - ASE is a set of Python based tools and modules for setting up, manipulating, running, visualizing and analyzing atomistic simulations (cf. ASE documentation).

Thanks to mutual support of the i-PI socket-communication and also file-IO by ASE and DFTB+, you can freely choose the most suitable type of communication for you. Further information can be found in the sections linked below.

Note: Before going through the following sections, please make sure that you have installed a working version of the ASE package. If you are wondering how to install ASE, please consult the appropriate documentation.

File-IO

For calculations without heavy file-IO, i.e. systems whose wallclock time is dominated by the electronic structure part of the calculation (due to extensive SCC cycles or large system size), the communication between DFTB+ and external software via file-IO may be suitable.

Calling DFTB+ via ASE

[Input: recipes/interfaces/ase/fileio/1_conjgrad/]

In order for ASE to find the DFTB+ executable and Slater-Koster files, environment variables must be set. Assuming the use of the BASH shell, this is done as follows (in general an adaptation to your specific computing environment will be necessary):

$ DFTB_PREFIX=~/slakos/mio-0-1/
$ DFTB_COMMAND=~/dftbplus/bin/dftb+

In this case the geometry optimization of a simple water molecule serves as an example:

3  C
 O H
     1    1    0.00000000000E+00  -0.10000000000E+01   0.00000000000E+00
     2    2    0.00000000000E+00   0.00000000000E+00   0.78306400000E+00
     3    2    0.00000000000E+00   0.00000000000E+00  -0.78306400000E+00

Following the necessary imports, the main script defines the path to the DFTB+ executable and the geometry .gen file containing the water molecule shown above. The main method then first reads in the geometry. Subsequently, a DFTB() calculator object with options that are crucial for the actual calculation is then instantiated. Here we perform a self-consistent calculation with the DFTB+ ConjugateGradient{} method as the driver.

The calculator gets attached to the system or geometry respectively and the calculation is started.

Finally, the convergent geometry as well as the forces and corresponding energy of the system can be read out directly by ASE:

from ase.io import read
from ase.calculators.dftb import Dftb


GEO_PATH = './H2O_cluster.gen'

def main():
    '''Main driver routine.'''
    system = read(GEO_PATH, format='gen')

    calc = Dftb(label='H2O_cluster', atoms=system,
                Hamiltonian_SCC='Yes',
                Hamiltonian_SCCTolerance='1.00E-010',
                Driver_='ConjugateGradient',
                Driver_MaxForceComponent='1.00E-008',
                Driver_MaxSteps=1000,
                Hamiltonian_MaxAngularMomentum_='',
                Hamiltonian_MaxAngularMomentum_O='"p"',
                Hamiltonian_MaxAngularMomentum_H='"s"')

    system.set_calculator(calc)
    calc.calculate(system)

    final = read('geo_end.gen')

    forces = system.get_forces()
    energy = system.get_potential_energy()

if __name__ == "__main__":
    main()

The script above causes ASE to create an input file (dftb_in.hsd) with the specified options and invokes DFTB+ in the corresponding directory.

Geometry Optimization by ASE

[Input: recipes/interfaces/ase/fileio/2_bfgs/]

Apart from the invocation of DFTB+ via file-IO, the use of DFTB+ as an energy/ force engine in conjunction with an external geometry driver is possible. To do so, again, set the required environment variables (see explanation above) and consider a water molecule of the following geometry:

3  C
 O H
     1    1    0.00000000000E+00  -0.10000000000E+01   0.00000000000E+00
     2    2    0.00000000000E+00   0.00000000000E+00   0.78306400000E+00
     3    2    0.00000000000E+00   0.00000000000E+00  -0.78306400000E+00

Similar to the previous section, the path to the .gen file containing the geometry is defined first, followed by the calculator. Subsequently, the driver of the geometry optimization is specified and where to write the trajectory and the logfile. In this case, BFGS() is used, representative of all the calculators provided by ASE:

from ase.io import read
from ase.optimize import BFGS
from ase.calculators.dftb import Dftb


GEO_PATH = './H2O_cluster.gen'

def main():
    '''Main driver routine.'''
    system = read(GEO_PATH, format='gen')

    system.set_calculator(Dftb(label='H2O_cluster', atoms=system,
                               Hamiltonian_SCC='Yes',
                               Hamiltonian_SCCTolerance=1.00E-010,
                               Hamiltonian_MaxAngularMomentum_='',
                               Hamiltonian_MaxAngularMomentum_O='"p"',
                               Hamiltonian_MaxAngularMomentum_H='"s"'))

    opt = BFGS(system, trajectory='opt.traj', logfile='opt.log')
    opt.run(fmax=1.00E-008)

    forces = system.get_forces()
    energy = system.get_potential_energy()

if __name__ == "__main__":
    main()

The script shown causes ASE to generate appropriate input files for each step of the geometry optimization. Note that this can lead to heavy file-IO and thus a significant increase in wallclock time, depending on the speed of the storage used. Additionally, the self-consistency is started from fresh for each structure, substantially increasing the number of SCC cycles. Therefore it is advisable to perform such calculations on a ramdisk or even better via Socket-Communication.

Socket-Communication

[Input: recipes/interfaces/ase/sockets/]

For calculations that heavily rely on file-IO, it is better to use communication between DFTB+ and ASE with the i-PI protocol. In these cases, the reduction of the wallclock time can be significant.

Note: At this time, only communication with ASE or i-PI has been tested. In general, however, other software which correctly implements the i-PI protocol should be supported.

Note

To enable socket-communication in DFTB+ the WITH_SOCKETS flag in the configuration file config.cmake must be set to TRUE, before starting the compilation process!

Geometry Optimization by ASE
Providing the input for DFTB+

To establish a connection via socket communication, DFTB+ is called with the appropriate Socket{} driver option:

Driver = Socket {
  File = "dftbplus"
  Protocol = i-PI {}
  MaxSteps = 1000
  Verbosity = 0
}

Which instructs DFTB+ to read geometry driving commands via a named temporary communication file (stored in /tmp/). In this example, due to a bug in the current ASE implementation of the i-PI protocol (see note below), the code is asked to perform up to 1000 geometry steps under control of the external interface then stop. For external driving codes that support the EXIT command in the protocol, instead DFTB+ can continue running until told to stop by using:

MaxSteps = -1

Due to the nature of the i-PI protocol, a dummy geometry must be provided, from which the number and species of the atoms, as well as unit cell vectors will be read. In general, the specification of arbitrary positions and cell vectors is possible. However, to avoid errors due to a different ordering of the atoms in the ASE-geometry and the species in the dummy-geometry, this example uses the ability of ASE to read in and write out .gen files. The output geometry (geo.gen) is then used as input for DFTB+, avoiding the possibility of the above mentioned errors:

Geometry = GenFormat {
  <<< "geo.gen"
}

In this example, calculation of a water molecule is performed without charge self-consistency and the required Slater-Koster files are stored in the top-level directory (same folder as dftb_in.hsd):

Hamiltonian = DFTB {
  SCC = No

  MaxAngularMomentum = {
    O = "p"
    H = "s"
  }

  SlaterKosterFiles = Type2FileNames {
    Prefix = "./
    Separator = "-"
    Suffix = ".skf"
  }
}

For the full input file assembled from the code snippets shown here, please consult the archive, whose location is stated at the begin of this section.

Geometry and main script

The geometry used is a simple water molecule, whose file location is specified in the main script (GEO_PATH):

3  C
 O H
     1    1    0.00000000000E+00  -0.10000000000E+01   0.00000000000E+00
     2    2    0.00000000000E+00   0.00000000000E+00   0.78306400000E+00
     3    2    0.00000000000E+00   0.00000000000E+00  -0.78306400000E+00

Following the necessary imports, the main script defines the type of socket to be used (UNIX socket) as well as the path to the DFTB+ executable (DFTBP_PATH) and the geometry (GEO_PATH). The main method then first reads in the geometry and immediately writes it out for the reasons mentioned above. Subsequently, the ASE driver BFGS() for geometry optimization is specified and where to write the trajectory and the logfile.

Finally, the convergent forces and the corresponding energy of the system can be read out directly by ASE:

import sys
from subprocess import Popen
from ase.io import read, write
from ase.optimize import BFGS
from ase.calculators.socketio import SocketIOCalculator


UNIXSOCKET = 'dftbplus'
DFTBP_PATH = ''
GEO_PATH = './H2O_cluster.gen'

def main():
    '''Main driver routine.'''

    system = read(GEO_PATH, format='gen')
    write('geo.gen', system, format='gen')

    opt = BFGS(system, trajectory='opt.traj', logfile='opt.log')

    with SocketIOCalculator(log=sys.stdout, unixsocket=UNIXSOCKET) as calc:
        Popen(DFTBP_PATH)
        system.set_calculator(calc)
        opt.run(fmax=1.00E-09)

    forces = system.get_forces()
    energy = system.get_potential_energy()

if __name__ == "__main__":
    main()

Note

To correctly close sockets on the ASE side, call calc.close() at the end or, more elegantly, enclose the class SocketIOCalculator using the with statement as done in the example shown here. Nevertheless, in the current state of ASE, the socket gets closed without warning missing the ‘EXIT’ string of the i-PI protocol, which always leads to an error message issued by DFTB+ at the end of a calculation driven by socket-communication.

Nudged Elastic Band - NEB

The nudged elastic band (NEB) method is used to obtain saddle points as well as minimum energy paths between known initial and final states. Intermediate images are introduced (by interpolation), for which local energy minima are calculated under the constraint of equal spacing between neighboring images (except for the climbing image modification). For a detailed description of all the features provided by ASE’s NEB class please consult the corresponding documentation (NEB class).

The umbrella inversion of a simple NH3 (ammonia) molecule serves as a prominent example for NEB energy barrier calculations.

NEB via File-IO

[Input: recipes/interfaces/ase/neb/fileio/]

In order for ASE to find the DFTB+ executable and Slater-Koster files, environment variables must be set. Assuming the use of the BASH shell, this is done as follows (in general an adaptation to your specific computing environment will be necessary):

$ DFTB_PREFIX=~/slakos/mio-0-1/
$ DFTB_COMMAND=~/dftbplus/bin/dftb+
Geometry

To calculate the energy barrier of the umbrella inversion process, it is necessary to provide geometries for the initial and final state.

A possible choice for the .gen files defining the initial and final state of the calculation could be the following:

initial state:

  4  C
N  H
  1 1    0.3282523436E+00   -0.6075947736E+00    0.1062852063E+00
  2 2    0.1193013364E+01   -0.3265926020E+00    0.5715962197E+00
  3 2    0.2770647466E+00   -0.1801902091E+00   -0.8199941035E+00
  4 2   -0.4767004541E+00   -0.3215324152E+00    0.6662026774E+00

final state:

  4  C
N  H
  1 1    0.3206429517E+00   -0.6319775238E+00    0.1527307910E+00
  2 2    0.1179077740E+01   -0.1010702015E+01    0.5563961495E+00
  3 2    0.2628891413E+00   -0.8794981362E+00   -0.8365549890E+00
  4 2   -0.4905898328E+00   -0.1001302325E+01    0.6515180485E+00

Figure 56 visualizes the defined geometries. The NH3 molecule passes through a planar transition state, thus inverting the pyramidal shape:

Initial a) and final b) state of the |NH3| (ammonia) umbrella inversion.

Initial a) and final b) state of the NH3 (ammonia) umbrella inversion.

Main script

The Python script shown below is a suitable and simple implementation to run a calculation with NIMAGES as the number of intermediate images (in this case set to 13).

The main method then first reads in the geometry and immediately writes it out for the reasons mentioned in Socket-Comm.: Geometry Optimization by ASE. The trajectories (obtained from previous calculations) of the initial and final state are read in and a list with the individual images is created, whereby the first and last image is determined by the states that have just been read in. True, independent copies of the first image are created for the 13 intermediate images. A NEB object is instantiated and a linear interpolation of the path from the intial and final state is calculated.

Subsequently, the ASE driver BFGS() for geometry optimization is specified and where to write the resulting trajectory (i2f.traj). A list of instantiated DFTB calculators is created and its elements attached to the single intermediate images. Finally the calculation is started by calling the run command, while passing the maximum force component fmax (mind the ASE Units!):

from ase.io import read, write
from ase.neb import NEB
from ase.optimize import BFGS
from ase.calculators.dftb import Dftb

NIMAGES = 13

def main():
    '''Main driver routine.'''

    initial = read('NH3_initial.traj')
    final = read('NH3_final.traj')

    images = [initial]
    images += [initial.copy() for ii in range(NIMAGES)]
    images += [final]

    neb = NEB(images)
    neb.interpolate()

    opt = BFGS(neb, trajectory='i2f.traj')

    calcs = [Dftb(label='NH3_inversion',
                  Hamiltonian_SCC='Yes',
                  Hamiltonian_SCCTolerance='1.00E-06',
                  Hamiltonian_MaxAngularMomentum_N='"p"',
                  Hamiltonian_MaxAngularMomentum_H='"s"')
             for ii in range(NIMAGES)]

    for ii, calc in enumerate(calcs):
        images[ii + 1].set_calculator(calc)

    opt.run(fmax=1.00E-02)


if __name__ == "__main__":
    main()
Analysis

The results of the NEB calculation can be extracted out of the trajectory file i2f.traj using corresponding ASE tools. If only the last 15 images (intermediate images plus initial and final state) are of interest, a construction like the following one would be suitable:

ase gui i2f.traj@-15:

Since ASE is capable of quickly visualizing the reaction path via the Tools ‣ NEB tab, you should obtain something similar to Figure 57:

Energy barrier of the umbrella inversion of a single |NH3| (ammonia) molecule.

Energy barrier of the umbrella inversion of a single NH3 (ammonia) molecule.

Since the initial and final state are identical after a suitable unitary transformation, the energetic difference \(\Delta E\) between the states obviously vanishes.

NEB via Socket-Communication

[Input: recipes/interfaces/ase/neb/sockets/]

Note

To enable socket-communication in DFTB+ the WITH_SOCKETS flag in the configuration file config.cmake must be set to TRUE, before starting the compilation process!

Providing the input for DFTB+

The composition of the input is analogous to the explanations in Socket-Communication. DFTB+ is invoked with the Socket{} driver, along the usual specifications for the Slater-Koster files, assuming that they are stored in the top-level directory of the calculation (same folder as dftb_in.hsd). The name dftbplus of the temporary file, which is required for communication via the i-PI protocol, initially only serves as a placeholder and is later replaced by the Python script used to run the NEB calculation. The extended HSD format allows to override properties (cf. Geometry) that were set earlier and is well suited for this use case. This effort is being made to ensure that each image has its own calculator attached:

Geometry = GenFormat {
    <<< "../../dummy.gen"
}

Driver = Socket {
    File = "dftbplus"
    Protocol = i-PI {}
    MaxSteps = -1
    Verbosity = 10
}

Hamiltonian = DFTB {
  SCC = Yes
  SCCTolerance = 1.0e-06

  MaxAngularMomentum = {
    H = "s"
    N = "p"
  }

  SlaterKosterFiles = Type2FileNames {
    Prefix = "../../slakos/"
    Separator = "-"
    Suffix = ".skf"
  }
}

ParserOptions {
  ParserVersion = 8
}
Geometry

For the geometries used as the initial and final state, please refer to section NEB (File-IO): Geometry.

Main script

In many respects, the script at hand is similar to that for File-IO calculations (cf. NEB (File-IO): Main script). Nevertheless, a slight modification is necessary:

Performing the NEB calculation requires setting up a folder structure so that the calculation of an image is carried out in a separate directory and is assigned to its own ASE calculator, while communicating via a separate UNIX socket or temporary file respectively. The Python script shown below is a suitable and simple implementation to run a calculation with NIMAGES as the number of intermediate images (in this case set to 13).

DFTBP_PATH specifies the path to the DFTB+ executable and HSD_INPUT the path to the input file discussed above (cf. Providing the input for DFTB+).

A list with the working directories for the DFTB+ calculations is then created and the input file is read. Method write_modhsd creates the separate subfolders for the different DFTB+ image calculations and copies the input file into the created directories. Using the capabilities of the extended HSD format, the filename dftbplus of the temporary communication file gets overwritten according to the corresponding image number.

A list of instantiated socket calculators is created and its elements attached to the single intermediate images. Sockets and clients are invoked and the calculators are closed after the calculations have been carried out:

import os
from subprocess import Popen
from ase.io import read, write
from ase.neb import NEB
from ase.optimize import BFGS
from ase.calculators.socketio import SocketIOCalculator


NIMAGES = 13

DFTBP_PATH = 'dftb+'
HSD_INPUT = 'dftb_in.hsd'
GEO_PATH = 'NH3_initial.gen'


def main():
    '''Main driver routine.'''

    system = read(GEO_PATH, format='gen')
    write('dummy.gen', system, format='gen')

    initial = read('NH3_initial.traj')
    final = read('NH3_final.traj')

    images = [initial]
    images += [initial.copy() for ii in range(NIMAGES)]
    images += [final]

    neb = NEB(images)
    neb.interpolate()

    opt = BFGS(neb, trajectory='i2f.traj')

    socketids = range(1, NIMAGES + 1)
    wdirs = ['_calc/image_{:d}'.format(socket) for socket in socketids]
    unixsockets = ['dftbplus_{:d}'.format(socket) for socket in socketids]

    write_modhsd(socketids)

    calcs = [SocketIOCalculator(log='socket.log', unixsocket=unixsocket)
             for unixsocket in unixsockets]

    for ii, calc in enumerate(calcs):
        images[ii + 1].set_calculator(calc)

    for cwd in wdirs:
        Popen(DFTBP_PATH, cwd=cwd)

    opt.run(fmax=1.00E-02)

    for calc in calcs:
        calc.close()


def write_modhsd(socketids):
    '''Writes input files based on 'dftb_in.hsd' with modified unixsockets.

    Args:
        socketids (1darray): contains unixsocket identifier
    '''

    for socket in socketids:
        path = '_calc/image_{:d}'.format(socket)
        os.makedirs(path, exist_ok=True)

        with open(path + '/dftb_in.hsd', 'w') as file:
            driver = '  <<+ ../../dftb_in.hsd' + '\n\n' + \
            '+Driver = +Socket ' + '{\n' + \
            '    !File = "dftbplus_{:d}"'.format(socket) + '\n}'
            file.write(driver)


if __name__ == "__main__":
    main()
Analysis

For a short introduction to the evaluation of the results, please consult the relevant section NEB (File-IO): Analysis.

Python Interface

The Python interface enables DFTB calculations to be performed by external programs. Results such as system energy, atomic forces and Mulliken charges can be extracted directly, along with performing tasks like defining calculations in external potentials or updating the geometry. With the pythonapi (version 0.1) an existing input file (dftb_in.hsd), which contains settings in addition to the geometry, is required for the initialization of DFTB+.

Regarding the units: Just like DFTB+, the interface expects and delivers values in atomic units!

Setting up DFTB+

For this special use case, DFTB+ needs to be compiled as a shared library with API support enabled. At this point, a basic understanding of how to build DFTB+ is assumed (all necessary steps are explained in the INSTALL.rst file in the top level directory of the DFTB+ repository). The CMake configuration for this case can be done by setting the WITH_API and BUILD_SHARED_LIBS flags to be TRUE in the file config.cmake, before starting the configuration and compilation process. Alternatively, if you do not want to modify files, a construct like the following is a convenient way to specify these flags on the command line while configuring with CMake:

cmake -DBUILD_SHARED_LIBS=1 -DWITH_API=1 ..

If only the pure compilation process is carried out, the resulting library is located inside the buid directory at prog/dftb+/. If make install was executed, there is also a copy placed in the CMAKE_INSTALL_PREFIX path, which defaults to install/lib/ inside the build directory. Within these locations, the library files libdftbplus.* will be installed.

The path to the resulting shared library must be passed to the interface, so you will need to know where the libdftbplus.* files ends up.

Setting up the interface

You can install the associated Python package via the standard ‘python setup’ mechanism. If you want to install it system-wide into your normal python installation, with the appropriate permissions you can simply issue python setup.py in the directory tools/pythonapi/. Alternatively, to install it locally in your home space, use python setup.py install --user. If the local Python installation directory is not in your PATH, you should add it accordingly.

Providing the input for DFTB+

[Input: recipes/interfaces/pyapi/]

To initialize an instance of the dftbplus calculator, you will need an initial file formatted according to the HSD structure for the usual dftb_in.hsd input of DFTB+.

Although the geometry of the structure to be calculated can be later replaced by your main script, it is necessary first to supply a dummy geometry (suitable for the desired number of atoms and boundary conditions). It is important to ensure that the total number of coordinates matches the number of atoms of the real geometry and that any specified lattice vectors are linearly independent, otherwise DFTB+ will return an error message. In the case of the supplied TiO2 example the following geometry block of dftb_in.hsd would be appropriate:

Geometry = GenFormat {

    6  S
 Ti  O
    1 1    0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
    2 1    0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
    3 2    0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
    4 2    0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
    5 2    0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
    6 2    0.0000000000E+00  0.0000000000E+00  0.0000000000E+00
    0.0000000000E+00    0.0000000000E+00    0.0000000000E+00
    1.0000000000E+02    0.0000000000E+00    0.0000000000E+00
    0.0000000000E+00    1.0000000000E+02    0.0000000000E+00
    0.0000000000E+00    0.0000000000E+00    1.0000000000E+02

}

The section Hamiltonian contains the usual entries for the calculation of a TiO2 supercell:

Hamiltonian = DFTB {

  Scc = Yes
  MaxSccIterations = 100
  SccTolerance = 1.00e-05

  SlaterKosterFiles = Type2FileNames {
    Prefix = "./slakos/mio-ext/"
    Separator = "-"
    Suffix = ".skf"
  }

  MaxAngularMomentum {
    Ti = "d"
    O  = "p"
  }

  KPointsAndWeights = SupercellFolding {
    4   0   0
    0   4   0
    0   0   4
    0.5 0.5 0.5
  }

}

To be able to also extract the forces via the interface, it is advisable to specify the corresponding entry in the Analysis{} block. Otherwise DFTB+ will issue an error message, asking the user to do so. To ensure backwards compatibility of the input, the parser version should also be specified:

Analysis {
  CalculateForces = Yes
}

ParserOptions {
  ParserVersion = 8
}

Main script

The script shown here serves to illustrate the use of the Python interface, based on the calculation of TiO2.

In order to be able to use the interface, the package dftbplus must be imported as the first step. The path LIB_PATH to the DFTB+ shared library is defined (note that the name prefix of the libray file name should also be part of the path), as well as conversion factors to convert the atom coordinates we will list in from Ångström into the atomic units (Bohr) required by the interface.

import numpy as np
import dftbplus


LIB_PATH = '/home/user/libdftbplus'

# DFTB+ conversion factors
# (according to prog/dftb+/lib_common/constants.F90)
BOHR__AA = 0.529177249
AA__BOHR = 1 / BOHR__AA

At the beginning of the main() function, the atom coordinates and lattice vectors are defined. In this case, a conversion to atomic units is necessary, since a .gen block is used whose values are usually in units of Ångström.

def main():
    '''Main driver routine.'''

    # coordinates of TiO2, in Ångström
    coords = np.array([
        [-0.016726922839251,  0.016725329441158, -0.000003204152532],
        [-0.016726505918979,  1.920201169305565, -7.297102897292027],
        [ 0.017412997824265, -0.024318617967798,  2.005339137853385],
        [ 1.920770753428742, -0.024319922392223, -4.437737763954652],
        [ 0.024319174400169, -0.017404302527510, -2.005347277168561],
        [ 0.024317270342179,  1.886164739806594, -5.291732430733527]])

    # lattice vectors of TiO2, in Ångström
    latvecs = np.array([
        [-1.903471721000000,  1.903471721000000,  4.864738245000000],
        [ 1.903471721000000, -1.903471721000000,  4.864738245000000],
        [ 1.903471721000000,  1.903471721000000, -4.864738245000000]])

    # conversion to atomic units
    coords *= AA__BOHR
    latvecs *= AA__BOHR

An object of the DftbPlus class is instantiated, which requires the location of the shared library libpath, the HSD input file hsdpath and the name of the log file logfile to be optionally specified. These keywords have default values ‘./libdftbplus’, ‘./dftb_in.hsd’ and None if not set explicitly. Note, that adding the shared library extension to libpath is not essential. Since the extension can be system dependent, it is guessed by the interface if missing. If logfile=None is specified, the output of the calculation gets printed to stdout.

After instantiation, the geometry can set or replaced; for periodic structures, lattice vectors can be specified in addition to the absolute coordinates.

The DFTB+ calculations are carried out automatically, as soon as the corresponding get_* methods are called. To correctly finalize the DFTB+ object, use the close() method.

    cdftb = dftbplus.DftbPlus(libpath=LIB_PATH,
                              hsdpath='dftb_in.hsd',
                              logfile='TiO2.log')

    # set geometry
    cdftb.set_geometry(coords, latvecs=latvecs)

    # get number of atoms
    natoms = cdftb.get_nr_atoms()

    # calculate energy, gradients and Gross charges
    merminen = cdftb.get_energy()
    gradients = cdftb.get_gradients()
    grosschg = cdftb.get_gross_charges()

    # finalize DFTB+ and clean up
    cdftb.close()


if __name__ == "__main__":
    main()

As always, please consult the archive to obtain the complete, connected script. To do so, follow the path mentioned above.

The Interface is also capable of defining a population (in)dependent external potential. This is covered in the following two sections (extpot, qdepextpot).

Using a population independent external potential

[Input: recipes/interfaces/pyapi/extpot/]

A external potential which does not depend on the Mulliken charges in the calculation can be included with only a small addition in the script. The DFTB+ object has a method set_external_potential(), which should be relatively self-explanatory. The external potential at the position of the QM-atoms is given as a positional argument. If forces are required, the gradient of the external potential at each atom can be passed additionally as the keyword argument extpotgrad.

Therefore, after the initialization of the DFTB+ object, the following code is inserted:

# example values of extpot and extpotgrad used here were
# taken from file: test/api/mm/testers/test_extpot.f90
extpot = np.array([-0.025850198503435,
                   -0.005996294763958,
                   -0.022919371690684])

extpotgrad = np.array([
    [0.035702717378527,  0.011677956375860, 0.009766745155626],
    [0.023243271928971, -0.000046945156575, 0.004850533043745],
    [0.016384005706180,  0.004608295375551, 0.005401080774962]])

# set external potential and its gradients
cdftb.set_external_potential(extpot, extpotgrad=extpotgrad)

Population dependent external potential

[Input: recipes/interfaces/pyapi/qdepextpot/]

This section deals with the capability of the interface to run calculations with a population dependent external potential, i.e. arrising in cases like polarizable surroundings where the applied field responds to the state of the QM calculation. Since in general only the user knows how to calculate this type of potential, callback functions can be defined which will then be executed at runtime.

The DFTB+ object provides a method register_ext_pot_generator() that takes care of the registration of the callback functions. As the first positional argument of this method, an arbitrary pointer can be specified. DFTB+ will pass back this pointer unaltered when calling the registered functions. You can typically use it to pass a pointer to the data or a Python object (class) which contains the necessary data for the potential calculation. If your data is in the global space and you do not need it, pass None (or equivalent). The second and third positional arguments have to be the function that provides the external potential and its gradients.

Furthermore, the auxiliary class PotentialCalculator is defined to perform the actual calculation of the external potential and its gradients. The structure of a script required for the calculation is explained below, using a trivial example in which the external potential and gradient are assumed to be zero. Therefore, this should not change the results of the calculation.

import numpy as np
import dftbplus


LIB_PATH = '/home/user/libdftbplus'


class PotentialCalculator:
    '''

       Auxiliary class for calculating the population dependent external
       potential and its gradients. An instance of this class gets handed over
       to DFTB+ via the ctypes interface, to handle the necessary callbacks.

    '''


    def __init__(self, qmcoords, mmcoords, mmcharges):
        '''Initializes a PotentialCalculator object.

        Args:

            qmcoords (2darray): coordinates of QM-atoms
                (shape: [qmatoms, 3])
            mmcoords (2darray): coordinates of MM-atoms
                (shape: [mmatoms, 3])
            mmcharges (1darray): charges of MM-atoms
                (shape: [mmatoms, 1])

        '''

        self._qmcoords = qmcoords
        self._mmcoords = mmcoords

        self._qmatoms = np.shape(self._qmcoords)[0]
        self._mmatoms = np.shape(self._mmcoords)[0]

        self._mmcharges = mmcharges


    def calc_extpot(self, dqatom):
        '''Calculates the current external potential
           using the properties of the MM- and QM-atoms.

        Args:

            dqatom (1darray): population difference with respect to
                reference population (usually the neutral atom)
                Note: population means electrons, so a
                positive number indicates electron excess

        Returns:

            extpot (1darray): updated external potential
                at the position of each QM-atom

        '''

        # Note: Some types of potential require knowledge of the
        # current atomic populations, which is provided by dqatom.

        extpot = np.zeros(self._qmatoms)

        return extpot


    def calc_extpotgrad(self, dqatom):
        '''Calculates the current gradients of the external
           potential using the properties of the MM- and QM-atoms.

        Args:

            dqatom (1darray): population difference with respect to
                reference population (usually the neutral atom)
                Note: population means electrons, so a ositive number
                indicates electron excess

        Returns:

            extpotgrad (2darray): updated potential gradient
                at the position of each QM-atom

        '''

        # Note: Some types of potential require knowledge of the
        # current atomic populations, which is provided by dqatom.

        extpotgrad = np.zeros((self._qmatoms, 3))

        return extpotgrad


def get_extpot(potcalc, dqatom, extpotatom):
    '''Queries the external potential.

    Args:

        potcalc (pyobject): instance of a class that provides methods for
            calculating the external potential and its gradients
        dqatom (1darray): population difference with respect to reference
            population (usually the neutral atom)
            Note: population means electrons, so a positive number indicates
            electron excess
        extpotatom (1darray): potential at the position of each QM-atom
            Note: it should be the potential as felt by an electron
            (negative potential value means attraction for an electron)

    '''

    extpotatom[:] = potcalc.calc_extpot(dqatom)


def get_extpotgrad(potcalc, dqatom, extpotatomgrad):
    '''Queries the external potentials gradients.

    Args:

        potcalc (pyobject): instance of a class that provides methods for
            calculating the external potential and its gradients
        dqatom (1darray): population difference with respect to referenc
            population (usually the neutral atom)
            Note: population means electrons, so a positive number indicates
            electron excess
        extpotatomgrad (2darray): potential gradient at the position of each
            QM-atom
            Note: it should be the gradient of the potential as felt by an
            electron (negative potential value means attraction for an
            electron)
    '''

    extpotatomgrad[:, :] = potcalc.calc_extpotgrad(dqatom)

The initialization of the calculator and the definition of the geometry is completely analogous to the above explanations. Only the registration of the callback functions is still missing:

# register callback functions for a qdepextpot calculation
cdftb.register_ext_pot_generator(potcalc, get_extpot, get_extpotgrad)

Please consult the associated archive with this tutorial to obtain the full corresponding example.

Licence

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Bibliography

[PPD2008]A. Pecchia, L. Salvucci, G. Penazzi and A. Di Carlo, Non-equilibrium Green’s functions in density functional tight-binding: methods and applications, New J. Phys. 10 065022 (2008).
[EPJE1998]M. Elstner, D. Porezag, G. Jungnickel, J. Elsner, M. Haugk, T. Frauenheim, S. Suhai and G. Seifert,a Self-consistent-charge density-functional tight-binding method for simulations of complex materials, Phys. Rev. B 58, 7260 (1998).