Change log

- Cholesky orthonormalization for improved minimize and SCF performance compared to symmetric orthonormalization (especially on GPUs)
- New features:
  - DFT-D3 dispersion correction
  - Wannierized defect matrix elements using a perturbed supercell calculation
  - Added option to sum over all atoms of species in orbital-projected DOS
  - Checkpoint-restart capability and selection of a subset of momentum transfers in electron-scattering
  - Finite-temperature support in electron-scattering
  - Detect and use potential-only functionals from LibXC
  - Use LibXC functionals without exchange and/or correlation
  - Output list of commensurate k-points in phononHsub to new phononKpts output
  - Landauer conductance included in FermiVelocity output
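
As a rough illustration of the Cholesky orthonormalization mentioned at the top of this list, the following pure-Python sketch orthonormalizes a small set of real vectors by factorizing their overlap matrix. It is a toy under stated assumptions and not the JDFTx implementation: the function names (`overlap`, `cholesky`, `cholesky_orthonormalize`) are hypothetical, vectors are real-valued lists, and no attempt is made at performance. The idea is that if S = C^T C = L L^T, then C' = C L^{-T} satisfies C'^T C' = I, and the triangular solve is cheaper than the eigendecomposition needed for symmetric (S^{-1/2}) orthonormalization.

```python
import math

def overlap(cols):
    """Overlap (Gram) matrix S[i][j] = <c_i|c_j> of a list of real vectors."""
    n = len(cols)
    return [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
            for i in range(n)]

def cholesky(S):
    """Lower-triangular L with S = L L^T (S must be positive definite)."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def cholesky_orthonormalize(cols):
    """Return C' = C L^{-T}, whose vectors are orthonormal."""
    L = cholesky(overlap(cols))
    out = []
    for j, c in enumerate(cols):
        v = list(c)
        # Triangular solve: c_j = sum_{k<=j} L[j][k] c'_k
        for k in range(j):
            v = [vi - L[j][k] * oi for vi, oi in zip(v, out[k])]
        out.append([vi / L[j][j] for vi in v])
    return out

if __name__ == "__main__":
    q = cholesky_orthonormalize([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
    for row in overlap(q):
        print(["%.3f" % x for x in row])
```

Unlike symmetric orthonormalization, the result depends on the order of the input vectors (the first vector is only rescaled), which is harmless when only the spanned subspace matters.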

- Exact exchange enhancements:
  - Acceleration using the adaptively compressed exchange (ACE) operator
  - Band parallelization of exact exchange allowing use of many more processes than nStates.
  - Added command exchange-params to control ACE convergence and band blocks in EXX calculation.

- BerkeleyGW interface:
  - Added rALDA kernel option
  - Added HDF5 flush for partial consistency of output during dense diagonalization
  - Fix in RPA frequency grid to match BerkeleyGW

- Compilation:
  - Support for CUDA 11, which required changes in linkage
  - Added CMake options to select ScaLAPACK other than MKL

- Complete rewrite of ionic-dynamics with full support for NVT and NPT using either Nose-Hoover or Berendsen thermostats/barostats
- Revamped lattice minimization with analytical stress tensors, replacing previous finite difference implementation
- Split and reorganized the BGW output code into manageable chunks divided by core, fluid and dense diagonalization functionality
- New features:
  - Methfessel-Paxton order 1 option in elec-smearing
  - Fermi velocity and ballistic conductance dump options in dump
  - Orbital-projected band weights output option in dump
  - Coulomb matrix elements output in FCIdump format in dump
  - Support for multiple hyperplane constraints in ionic optimization
  - Support for sequence / animated xyz files in xyzToIonposOpt, generating several ionpos files with a common lattice
  - Electric field matrix element output in wannier
  - Electron-phonon matrix element acoustic sum-rule correction in the presence of external electric fields
  - Long-range polar corrections for 2D phonons output by wannier
  - Option to bypass electronic DFT with an LJ-potential (lj-override) for quickly testing ionic minimize / dynamics algorithms
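
For readers unfamiliar with the Methfessel-Paxton order 1 smearing listed above, here is a minimal sketch of the textbook occupation functions, with Fermi and Gaussian smearing included for comparison. The function names are hypothetical, and the width convention (here `x = (eps - mu) / sigma`) is an assumption that may differ from the one used internally by elec-smearing.

```python
import math

def fermi_occupation(eps, mu, sigma):
    """Fermi-Dirac occupation 1/(1+exp(x)), written with tanh to avoid overflow."""
    x = (eps - mu) / sigma
    return 0.5 * (1.0 - math.tanh(0.5 * x))

def gaussian_occupation(eps, mu, sigma):
    """Gaussian smearing: complementary error function step."""
    x = (eps - mu) / sigma
    return 0.5 * math.erfc(x)

def mp1_occupation(eps, mu, sigma):
    """Methfessel-Paxton order 1: Gaussian step plus first Hermite correction."""
    x = (eps - mu) / sigma
    return 0.5 * math.erfc(x) - x * math.exp(-x * x) / (2.0 * math.sqrt(math.pi))

if __name__ == "__main__":
    mu, sigma = 0.0, 0.01  # illustrative values only
    for eps in (-0.03, -0.01, 0.0, 0.01, 0.03):
        print("%6.2f  %8.5f  %8.5f  %8.5f" % (
            eps,
            fermi_occupation(eps, mu, sigma),
            gaussian_occupation(eps, mu, sigma),
            mp1_occupation(eps, mu, sigma)))
```

Note that the order 1 occupations are not confined to the interval [0, 1]: they go slightly negative just above the chemical potential and slightly exceed 1 just below it, which is a known property of this scheme rather than a numerical error.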

- Exact exchange enhancements:
  - Added support for exact-exchange in fixed-Hamiltonian mode, including for electronic-scf and converge-empty-states
  - Cached real space wavefunctions in blocks (size set by exchange-block-size) to asymptotically halve number of FFTs
  - New symmetrization logic that globally minimizes the number of k-point pairs using a Metropolis algorithm during initialization
  - Added exact-exchange support in dense diagonalization for BGW output

- Bug fixes:
  - Fixed Wannier finite-difference formula bug for highly anisotropic supercells (encountered primarily with large k-point sampling in 2D)
  - Fixed LibXC support for range-separated hybrids
  - Fixed DFT+U contribution to forces for cases with multiple atoms in species with U (introduced pre-1.0 when order of atomic orbitals was changed)
  - Revamped dipole moments calculation to correctly account for embed center and periodicity
  - Added separate coulomb object for tighter wavefunction grid if required for exact exchange or electron scattering.
  - Check for dump-only issued without initial state in jdftx and phonon, with a graceful exit instead of a crash

- Documentation updates:
  - Added tutorial demonstrating use of Electron-phonon matrix elements
  - Added suite of tutorials on Defects calculations including in bulk, surface and 2D materials
  - Updated Wannier tutorials to account for new MLWF format from 1.5 version

- Wannier enhancements:
  - Condensed all Wannier outputs to unique-cell only, with separate cell weights to be used in post-processing, which substantially reduces memory and disk requirements of wannier tight-binding in large systems.
  - Added wannierization of spin matrix elements in non-collinear mode
  - Added support for polar corrections to e-ph matrix elements
  - Updated e-ph matrix sum rule evaluation to use gradient matrix element
  - Added wannierization of slab-weight matrix elements
  - Improved translational invariance of wannier localization measure within supercell, particularly important for cases with few k-points.
  - Load/save of overlap matrix elements (in *.mlwfM0 file) to speed up initialization of Wannier minimization
  - Optimized performance of Wannier minimization substantially by reducing number of matrix multiplications and communication
  - Reduced memory requirement for Wannier matrix-element output using block-wise Fourier transforms
  - Refactored and split the wannierized-output code into smaller files for easier maintenance
  - Removed rarely-used wrapWignerSeitz option to simplify code, and with it, removed mlwf versions of pure phonon properties

- Solvation features:
- Internal MPI wrapper changes:
  - Added wrappers for all-to-one collective operators and asynchronous operators
  - Added general interface for MPI operations on ManagedMemory and other containers
  - Added support for multiplet datatypes including complex, vector3<> etc. in MPI wrapper

- Fixed bug in symmetry detection and density symmetrization in noncollinear magnetic calculations
- Revamped and optimized ASE interface to work with Python3, keep state for continuing, and to support pseudopotential sets
- Added option for including exchange-correlation effects in electron-electron scattering
- Optimized spherical harmonic templates, added analytic derivatives, and used them to optimize [r,H] (momentum) matrix element evaluation
- Added band-structure unfolding output option
- Added option to control whether inversion symmetry is used in k-point symmetries (required for BerkeleyGW compatibility)
- Added environment variable input JDFTX_CPUS_PER_NODE to limit total number of cores used per node
- Added an install target to the CMake build system, correctly handling pseudopotential paths etc. from installed directory

- BerkeleyGW interface enhancements:
  - ScaLAPACK diagonalizer for large-bands BerkeleyGW output, controlled by new command bgw-params
  - Support for smearing in fixed-hamiltonian calculations such that fillings are updated at end, required for correct BerkeleyGW output for metals
  - Added off-diagonal Vxc matrix elements output

- Eigenvalue over-ride options in density-of-states and wannier (to take input from other DFT or MBPT codes)
- Added option in command pcm-params to independently control ionic cavity mask
- Added external magnetic field support with command target-Bz (z-spin only)
- Added support for dense-matrix diagonalization and inversion on the GPU using the cuSolver library
- Added support for LibXC 4 (with continued support for incompatible LibXC 3 API)
- Added number of species and atoms report during initialization
- Improved supercell construction to generate diagonal super matrix whenever possible (in order to fix spurious "not a uniform k-mesh" errors)
- Added Fourier interpolation option in createVASP to generate high-resolution electron densities for Bader analysis

- Added newer versions of PBE and LDA GBRV pseudopotentials where available, and added PBEsol GBRV pseudopotentials for all elements
- Added links pointing to the latest version of each GBRV and SG15 pseudopotential (eg. GBRV/$ID_pbe.uspp and SG15/$ID_ONCV_PBE.upf)
- Added ultrasoft and spin-polarized support for electron-electron scattering
- Wannier enhancements:
  - New syntax for wannier-center specification, with atomic orbitals selected by atom index rather than position
  - Option spinMode in command wannier to optionally select Wannier function generation for specific spin channels
  - Improved start-up time and memory requirement with ultrasoft pseudopotentials by transforming projections along with wavefunctions
  - Meaningful error messages rather than stack trace for linearly-dependent trial orbitals

- Added t2g and eg orbital-set projection options to DOS
- Added option to restrict fluid cavity with a z-directed slab, controlled by command pcm-params
- Added normalization of atomic orbitals on radial grid to fix DFT+U for bad pseudopotentials (not a problem for GBRV, but previously an issue for some SG15 pseudopotentials)
- Updated internal normalization of SpeciesInfo::psiRadial to correspond more closely to normalized wavefunctions

- Support for ultrasoft pseudopotentials in phonon, Wannier, momentum matrix elements and e-ph matrix elements
- Phonon enhancements:
  - Support for manual symmetries in phonon
  - Switched phonon force matrix sum-rule / hermiticity corrections to k-space
  - Added “Gamma-prime” correction to phonon force matrix that guarantees finite sound velocities regardless of numerical error
  - Always outputs finite free energies neglecting imaginary components
  - Detailed warning about imaginary frequencies including maximum magnitude and corresponding wave vector

- Fixed numerous phonon bugs:
  - Mismatch between phase convention and cell weights (affected non-inversion-symmetric systems)
  - Handling of Fcut for insulators with smearing
  - Free energy report for 1x1x1 supercell (all components previously zero)
  - Error in diagonal e-ph matrix elements (caused a minor indirect error; only through interpolation)
- Added work-around for bug in CUDA-aware OpenMPI (segfault when bcast'ing from GPU pointers)

- Wannier enhancements / bug-fixes:
  - Added option wrapWignerSeitz to wannier which allows compactification of Wannierized Hamiltonian and matrix elements when atoms / centers are not in fundamental unit cell / WS cell
  - Reduced memory requirement for Wannierization of electron-phonon matrix elements
  - Fixed finite difference formula determination for highly asymmetric unit cells requiring large number of shells
  - Restored support for numerical orbitals (broken since ~ 1.3.0)

- Enhancements to dump mechanism:
  - Option $INPUT to automatically name outputs based on input file, and default dump-name is now $INPUT.$VAR (instead of $STAMP.$VAR)
  - Capability to use different dump-name for each dump frequency
  - Added dump frequency Init to dump a subset of outputs eg. symmetries, k-points etc. at the end of initialization, including in dry runs

- Interface to Berkeley GW: export wave functions, eigenvalues, XC matrix elements etc. for input to GW many-body perturbation theory calculations
- Implemented soft-sphere solvation model that uses cavity comprised of atomic spheres, with the extra option of a separate ionic cavity for electrochemistry
- Improved resolution in DOS output by updating Cspline to Lspline conversion
- Exposed DOS calculation using the tetrahedron method as a new class TetrahedralDOS, making it usable from external utilities
- Added support for multi-level MPI communicators: mpiWorld, mpiGroup etc. (placeholder for future multi-level MPI parallelization)
- Improved handling of marginal symmetries in atom positions: helpful error message suggesting command symmetry-threshold
- Fixed GridInfo mismatch bug in polarizability and electron scattering with embedded Coulomb truncation

- Added script createVASP to export structure and electron density in VASP's CHGCAR format
- Updated LatticeMinimizer to jointly optimize ionic positions, rather than in an inner loop
- Switched ManagedMemory to a templated version on arbitrary data types, and used it to handle all GPU/CPU allocations and data transfers automatically.
- Added an optional memory pool to ManagedMemory selected by environment variable JDFTX_MEMPOOL_SIZE, which is especially advantageous in multi-GPU jobs to avoid expensive cudaMallocs.
- Substantial improvements to GPU performance under CUDA 7 and newer by consolidating cudaGetDevice() and cudaGetDeviceProperties() calls (order of magnitude for small calculations)
- Added CMake option CudaAwareMPI to apply MPI operations directly on GPU pointers (default off)
- Added CMake option PinnedHostMemory to use page-locked host memory for GPU mode (default off)
- Reorganization of code between electronic and core, consolidating most non-wavefunction-specific structures and operators to core
- Reorganization of documentation with bibliography and better use of Doxygen modules

- Space group symmetrization instead of point groups alone. This now allows for calculations to be equally efficient regardless of where the structure is centered. This also substantially reduces the number of supercell calculations required by phonon for many systems. IMPORTANT: this may reduce the number of k-points for certain systems, which will make the state incompatible with previous versions.
- Adjusted automatic FFT grid selection to be supercell-consistent i.e. minimum grid dimensions of unit cell, when multiplied by supercell count, will satisfy minimum grid dimensions for corresponding supercell at same plane-wave cutoff. IMPORTANT: default FFT boxes will change in some cases relative to 1.2.1 and earlier. This will only alter real-space grid outputs; wavefunctions remain unchanged.
- Switched to reciprocal-space symmetrization of scalar fields
- Changed supercell cell map determination to use Wannier centers / atom positions which makes Wannier tight-binding Hamiltonians and phonon force matrices strictly supercell-consistent and translation-invariant (which they previously weren't). Importantly, this allows Gamma-only Wannier and phonon calculations for large systems.
- Fixed numerous bugs in phonon code for cases with more than one atom per unit cell, eg. when symmetries of phonon supercell mapped atoms to ones in other unit cells, complex conjugate handling of non-inversion-symmetric cells etc.
- Support to split phonon calculations into individual perturbations that can be run separately (and in parallel) and collected together at the end. Added a summary output to phonon dry runs that facilitates planning the split calculations.
- Added phonon tutorials
- Added support for electric fields in periodic directions using low wave-vector perturbations. This enables bulk dielectric calculations using new command bulk-epsilon.
- Added support for anisotropic dielectric functions and surface defects without slab-mode Coulomb truncation in charged-defect-correction
- Added command exchange-parameters for controlling EXX scale factor and screening in hybrid functionals
- Added MPI and noncollinear support to exact exchange
- Added command symmetry-threshold to control symmetry detection sensitivity in input file
- Fixed phase of odd-l trial orbitals in Wannier (these were previously pure-imaginary instead of real)
- Fixed strange bugs in phonon and Wannier codes that resulted from unused pseudopotentials specified in input file
- Redefined subspace rotation factor to be consistent with the grand-canonical DFT paper, so that reasonable initial values are close to 1 (rather than 100 previously).
- Restored non-variational energy and gradient calculation for LinearPCM and SaLSA because it substantially improves the quality of the electrostatic potential, especially in large unit cells and low concentrations. Correspondingly, removed the option to run them with a Gummel iteration (introduced in 1.1.0). A related change substantially improves potentials output by NonlinearPCM when using pcm-nonlinear-scf.
- Updated LibXC support version to >=3, and added mechanism to auto-generate list of LibXC functionals by parsing its header file.
- Added a 1D (planarly-averaged for slabs) dielectric matrix output to electron-scattering
- Added support for BlueGene/Q systems
- Added a CMake option for automating static linking

- Added Gaussian and Cold smearing options in addition to the original Fermi fillings, and deprecated the old elec-fermi-fillings command in favor of elec-smearing.
- Added outer loop over fixed charge calculations (using secant method to optimize nElectrons) as an alternate fall-back method for fixed-potential calculations.
- Stabilized initialization of fixed-potential calculations using LCAO followed by an automatic initial vacuum calculation
- Switched to density mixing as the default for SCF.
- Improved stability of new Y-less minimizer against large steps by remembering subspace rotations for the duration of one line minimize.
- Fixed a bug in ionic minimize search-direction symmetrization introduced in 1.2.0
- Added a mechanism to gracefully deprecate commands: a placeholder for the old command issues a deprecation warning and replaces itself with the new syntax, so that the output file shows the new syntax and all other command code (dependencies / forbids etc.) only need to deal with the new command.
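
The outer loop over fixed-charge calculations mentioned above can be pictured with the following sketch of a secant iteration on the electron count. This is only an illustration under stated assumptions: `mu_of_n` stands in for a full fixed-charge calculation that returns the chemical potential, `toy_mu` is an invented smooth model, and none of these names correspond to actual JDFTx code.

```python
def secant_solve_n_electrons(mu_of_n, mu_target, n0, n1, tol=1e-8, max_iter=50):
    """Find n such that mu_of_n(n) == mu_target using the secant method.

    mu_of_n: callable returning the chemical potential at fixed electron count
             (hypothetical stand-in for one converged fixed-charge calculation).
    n0, n1:  two distinct initial guesses for the electron count.
    """
    f0 = mu_of_n(n0) - mu_target
    f1 = mu_of_n(n1) - mu_target
    for _ in range(max_iter):
        if abs(f1) < tol:
            return n1
        if f1 == f0:
            raise ZeroDivisionError("secant step undefined: flat mu(n)")
        # Secant update from the two most recent points
        n2 = n1 - f1 * (n1 - n0) / (f1 - f0)
        n0, f0 = n1, f1
        n1 = n2
        f1 = mu_of_n(n1) - mu_target
    raise RuntimeError("secant iteration did not converge")

def toy_mu(n):
    """Invented model: mu rises smoothly as electrons are added."""
    d = n - 10.0
    return -0.20 + 0.05 * d + 0.01 * d * d

if __name__ == "__main__":
    n = secant_solve_n_electrons(toy_mu, mu_target=-0.15, n0=10.0, n1=10.5)
    print("nElectrons = %.6f, mu = %.8f" % (n, toy_mu(n)))
```

Each evaluation of `mu_of_n` would correspond to one complete fixed-charge calculation, so the appeal of the secant method here is that it needs no derivative information and typically converges in a handful of outer iterations for a smooth mu(n).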

- Substantial memory optimizations, reducing typical usage by 30-50 % overall:
  - Total energy minimize: switched from analytically-continued (Y and C) to orthonormal-only (C alone) algorithm, which eliminates one persistent copy of the wavefunctions
  - SCF: reduced memory usage of the BandDavidson algorithm by eliminating cached copies of the overlap operator applied to the wavefunctions, and also of the conjugate-gradients BandMinimizer by switching away from the analytically-continued approach.
  - Reduced peak memory usage further by improving move semantics and delaying / combining key ColumnBundle and matrix operations using template meta-programming.
  - Added debug code options in ManagedMemory to stack-trace peak memory usage points

- Rewrote auxiliary Hamiltonian fillings algorithm with automatic preconditioner tuning, resulting in substantially better convergence, including at fixed potential. Auxiliary Hamiltonian is kept diagonal in new algorithm; the state for the new variable-fillings optimizer therefore uses eigenvals instead of Haux (deleted).
- Improved stability of ionic and lattice minimizers against occasional unphysically large steps.
- Updated WannierMinimizer to continuously update rotations in-place, analogous to the changes in ElecMinimizer and BandMinimizer, and switched to L-BFGS as the default optimizer (faster than CG after these changes).
- Removed: defunct fillings algorithms such as periodic mixing and mu controller, inverse Kohn-Sham minimizer and fix-occupied functionality
- Individual-frame output and Wannier orbital support in the createXSF script
- Expanded test set that covers salient features (loosely based on the new tutorial sequence)
- Complete set of tutorials for molecules, solids, surfaces and Wannier function calculations
- Detailed compilation instructions for Debian and Redhat-based GNU/Linux, MacOS X and Cygwin/Windows

- Added support for compiling and running on Windows using Cygwin
- Added option to link JDFTx statically
- New detailed tutorial set for molecular systems
- Energy correction for charged defects at surfaces in vacuum and solution
- Consistent behavior for binary file I/O on big-endian architectures such as PowerPC64 (eg. BlueGene)

- Moved auxiliary executables to aux making their compilation optional, and renamed testsuite to test
- Fixed compilation for new compilers: up to g++-5.3, Clang 3.8 and CUDA 7.5

- Improved convergence and fixed initialization errors in the SCF version of NonlinearPCM, and included softening of ion packing limit to eliminate NANs
- Timing information in all iterations (electrons, fluids, geometry etc.), and corresponding support in the plotConvergence script
- Added command to control whether to use Gummel loop or inner minimization for any fluid type. Correspondingly, energy evaluation in LinearPCM and SaLSA is now strictly variational (rather than generically first-order correct) to support Gummel iteration.
- Added command to control electrostatic potential output (raw or atom-potential corrected)
- Automatic initial vacuum electronic solve in fixed-mu fluid calculations remains neutral for consistency
- Renamed ion dynamics classes (not named specifically Verlet any more), and added confining potential options
- Added HyperPlane constraint to constrain combined motion of several ions and fixed transformation of linear constraint from lattice coordinates

First stable release of JDFTx, which marks the transition from SVN to Git for version control. A detailed change log was unfortunately not maintained prior to this release, but here's a partial list of major developments since the first numbered release:

- MPI parallelization over k-points for electronic DFT, orientations for classical DFT and angular momenta for SaLSA
- Vibrations, phonons and improved Wannier support for *ab initio* tight binding models (including electron-light and electron-phonon coupling)
- Non-collinear magnetism and spin-orbit coupling, including support for DFT+U as well as phonons
- Support for electronic-SCF and the Davidson eigenvalue solver in addition to the default variational minimize using the Conjugate Gradients algorithm
- Iterative LCAO initial guess for wavefunctions (essentially a fully self-consistent DFT calculation in the basis of atomic orbitals)
- Substantial optimization of density augmentation code to speed up ultrasoft pseudopotentials, and projector caching to optimize nonlocal potential evaluations
- Separate wavefunction and charge density cutoff, with an optionally tighter plane-wave grid for wavefunction transforms
- Systematized fluid specification: solvent and ionic fluid components, pcm-variant to control different parametrizations within classes of solvation models.
- Enhanced Coulomb truncation using double-sized boxes, including for fluids
- Refactored LinearPCM, NonlinearPCM and SaLSA code to use a common PCM base class
- Added the CANDLE, SCCS and SGA13 solvation models
- Polarizability support in Classical DFT fluids
- Charged-defect energy and slab dielectric function calculations; atom-potential correction for electrostatic potentials
- Preliminary support for *ab initio* molecular dynamics
- Added LibXC 2.0 support for additional exchange-correlation functionals
- Convenient wildcard pseudopotential specification, support for QE's UPF format and built-in GBRV and SG15 pseudopotential sets
- Scripts to convert XYZ files to JDFTx geometries, and to visualize JDFTx geometries and outputs via XSF support (XCrysDen, VESTA etc.)
- Reorganized tutorials, website and documentation, merging them all within Doxygen

First numbered release of JDFTx: a fully-functional plane-wave electronic DFT code with particular emphasis on solvation models, classical DFT and joint DFT.

Includes full support for GPU computing using CUDA, and efficient parallelization on multiprocessor machines using threads, but no support yet for cluster computing using MPI.