API

Database

Contains all relevant classes and functions for constructing databases of various materials and clusters. This includes pulling and processing data from the Materials Project Database, as well as utilizing existing data the user may have on their hard drive.

class lightshow.database.Database(structures, metadata=None, supercells=None, supercell_cutoff=None, inequivalent_sites_initialized=False, supercells_initialized=False)

Bases: MSONable

Contains all materials and metadata for some database.

property database_status

A dictionary containing the current status of the database, i.e. everything except the structures, metadata and supercells.

Return type:

dict

classmethod from_files(root, filename='CONTCAR', pbar=True)

Searches for files matching the provided filename, which can include wildcards, and assumes those files are structural files in a format that can be processed by Structure.from_file. Each structure is given its own index, with the origin path stored in its metadata.

{
    "0": struct1,
    "1": struct2,
    ...
}
Parameters:
  • root (str) – The directory in which to begin the search.

  • filename (str, optional) – The files to search for. Uses rglob to recursively find any files matching filename within the provided directory.

  • pbar (bool, optional) – If True, will show a tqdm progress bar.

Return type:

Database
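The discovery step described above can be sketched with the standard library alone. This is a minimal illustration, not Lightshow's implementation: index_structure_files is a hypothetical helper, the sort order and the "origin" metadata key are assumptions, and the files are only located, not parsed into structures.

```python
from pathlib import Path


def index_structure_files(root, filename="CONTCAR"):
    """Hypothetical helper: map string indices to matching file paths."""
    # rglob searches root and all of its subdirectories; wildcards are allowed.
    # Sorting is an assumption made here so that the indices are reproducible.
    paths = sorted(p for p in Path(root).rglob(filename) if p.is_file())
    files = {str(i): p for i, p in enumerate(paths)}
    # Keep the origin path alongside each index, as the metadata would.
    # The "origin" key name is an assumption for this sketch.
    metadata = {key: {"origin": str(p)} for key, p in files.items()}
    return files, metadata
```

Calling index_structure_files("my_data", "*.cif") would pick up every CIF file below my_data, whatever its depth in the directory tree.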

classmethod from_files_molecule(root, filename='*.xyz', lattice=None, pbar=True)

Searches for files matching the provided filename, and assumes those files are structural files in a format compatible with Molecule.from_file.

Parameters:
  • root (str) – The directory in which to begin the search.

  • filename (str, optional) – The files to search for. Uses rglob to recursively find any files matching filename within the provided directory.

  • lattice (list of floats, optional) – Lattice parameters used to construct the crystal lattice. If not provided, defaults to [20.0, 20.0, 20.0] Angstroms.

  • pbar (bool, optional) – If True, will show a tqdm progress bar.

Return type:

Database
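The lattice default can be illustrated as follows. This sketch assumes that the three floats are the edge lengths of an orthogonal box surrounding the molecule; that interpretation, and the helper name box_lattice, are assumptions rather than something the documentation states.

```python
DEFAULT_BOX = [20.0, 20.0, 20.0]  # Angstroms, the documented default


def box_lattice(lattice=None):
    """Hypothetical helper: 3x3 lattice matrix for an orthogonal box."""
    # Assumption: the three floats are the box edge lengths a, b and c.
    a, b, c = DEFAULT_BOX if lattice is None else lattice
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]
```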

classmethod from_materials_project(api_key=None, method=None, **kwargs)

Constructs the Database object by pulling structures and metadata directly from the Materials Project. This is a simple passthrough method which utilizes the MPRester.materials.search API of the Materials Project v2 API.

Parameters:
  • api_key (str, optional) – API key, either provided directly or read from the MP_API_KEY environment variable.

  • method (str, optional) – Keyword selecting which kind of information to fetch about the materials, e.g. ‘thermo’, ‘xas’ or ‘summary’, which fetch thermodynamic properties, computed XAS data, and a large amount of amalgamated data about the material, respectively. See https://api.materialsproject.org/docs for more details.

  • **kwargs – Additional keyword arguments, passed through to the Materials Project search call.

Return type:

Database
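The fallback from an explicit api_key to the MP_API_KEY environment variable might look like the sketch below. resolve_api_key is a hypothetical helper, and raising ValueError when no key is found is an assumption; the documentation does not say how a missing key is handled.

```python
import os


def resolve_api_key(api_key=None):
    """Hypothetical helper: prefer the explicit key, else the environment."""
    if api_key is not None:
        return api_key
    # MP_API_KEY is the environment variable named in the documentation.
    key = os.environ.get("MP_API_KEY")
    if key is None:
        # Assumption: fail loudly rather than send an unauthenticated request.
        raise ValueError("No API key given and MP_API_KEY is not set.")
    return key
```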

initialize_inequivalent_sites()

Iterates through the structures and updates the metadata with keys corresponding to the inequivalent sites in the structure. This also tracks the atom types corresponding to the inequivalent sites and their multiplicities in the structure.

initialize_supercells(supercell_cutoff=9.0)

Initializes the supercells from the structures pulled from the Materials Project.

Parameters:

supercell_cutoff (float, optional) – Parameter used for constructing the supercells. Default is 9 Angstroms. TODO: this needs clearer documentation.

property metadata

A dictionary of metadata information about the structure. This information is filled by the Materials Project when data is pulled, and might be empty if there is no associated metadata (in cases of using data already on disk, for example).

Return type:

dict

property structures

A dictionary of pymatgen.core.structure.Structure objects. Contains the primitive structures. The keys are the IDs of the structure. In the case of data pulled from the Materials Project, these are the MPIDs. Otherwise, they are simply strings encoding some information about the origins of the structures.

Return type:

dict

property supercells

A dictionary of pymatgen.core.structure.Structure objects containing supercells and the same keys as structures.

Return type:

dict

write(root, absorbing_atoms=None, options=[], pbar=True, copy_script=None, write_unit_cells=True, write_multiplicity=True)

The core method of the Database class. This method will write all input files specified in the options parameter to disk. Of particular note is the directory structure, which is always consistent regardless of the type of calculation. At the first level is the key (usually an mpid) indexing the material of interest. At the next level is the user-specified “name” of the calculation, which is the first element of each tuple in the options. Next are the material-specific calculations. For example, in spectroscopy calculations, each absorbing atom will have its own directory at this level. Within each of those are the input files for that calculation. For example:

mp-390
    VASP
        000_Ti
        SCF
    FEFF-XANES
        000_Ti
            # ... (input files)
mvc-11115
    # ...
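
The layout above can be reproduced with pathlib. This is only a sketch: site_directory is a hypothetical helper, and the three-digit zero padding is inferred from the 000_Ti entry in the example rather than taken from the library.

```python
from pathlib import Path


def site_directory(root, key, calc_name, site_index, symbol):
    """Hypothetical helper: build root/<key>/<calc_name>/<NNN>_<symbol>."""
    # Zero padding to three digits is inferred from the "000_Ti" example.
    return Path(root) / key / calc_name / f"{site_index:03d}_{symbol}"


path = site_directory("out", "mp-390", "FEFF-XANES", 0, "Ti")
print(path.as_posix())  # out/mp-390/FEFF-XANES/000_Ti
```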

Note

At the end of writing the files, a writer_metadata.json file will be saved along with the directories containing the input files. This metadata file contains all information about the parameters used to construct the input files, including those passed as arguments to this method.

Parameters:
  • root (os.PathLike) – The target root directory to save all of the input files.

  • absorbing_atoms (str or list, optional) – The absorbing atom type symbol(s), e.g. "Ti". Note that if None, any calculations in which the absorbing atom is required (e.g. all spectroscopy) will be skipped. Only calculations that do not require absorbing atoms to be specified (e.g. neutral potential VASP electronic structure self-consistent procedure) will be performed. Note this can also be "all", in which case, every atom in the structure will have input files written for it.

  • options (list, optional) – A list of lightshow.parameters._base._BaseParameters objects or derived instances. The choice of options not only specifies which calculations to set up; each option also contains the complete set of parameters necessary to characterize each individual set of input files (e.g. for FEFF or VASP).

  • pbar (bool, optional) – If True, enables the tqdm progress bar.

  • copy_script (os.PathLike, optional) – If not None, copies the script at the provided path to each of the input file locations.

  • write_unit_cells (bool, optional) – If True, writes the unit cells to the materials directory in POSCAR format.

  • write_multiplicity (bool, optional) – If True, writes the multiplicities of each atom in the unit cell to a multiplicities.json file.
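
The three accepted forms of absorbing_atoms (None, a symbol or list of symbols, and "all") can be illustrated with a small selection routine. select_absorbers is a hypothetical helper; it picks sites directly from a list of element symbols and ignores any reduction to symmetry-inequivalent sites, so it should be read as a sketch of the documented semantics only.

```python
def select_absorbers(absorbing_atoms, species):
    """Hypothetical helper: list the site indices to write inputs for.

    species holds one element symbol per site, e.g. ["Ti", "O", "O"].
    """
    if absorbing_atoms is None:
        # Calculations that require an absorbing atom are skipped.
        return []
    if absorbing_atoms == "all":
        # Every atom in the structure gets input files.
        return list(range(len(species)))
    if isinstance(absorbing_atoms, str):
        # A single symbol is treated like a one-element list.
        absorbing_atoms = [absorbing_atoms]
    wanted = set(absorbing_atoms)
    return [i for i, symbol in enumerate(species) if symbol in wanted]
```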

Parameters


Pymatgen Utilities

lightshow.pymatgen_utils.atom_in_structure(atom_symbol, structure)

Checks the provided structure to see if the atom_symbol is present in it.

Parameters:
  • atom_symbol (str)

  • structure (pymatgen.core.structure.Structure)

Return type:

bool

lightshow.pymatgen_utils.get_inequivalent_site_info(structure)

Gets the symmetrically inequivalent sites as found by the SpacegroupAnalyzer class from Pymatgen.

Parameters:

structure (pymatgen.core.structure.Structure) – The Pymatgen structure of interest.

Returns:

A dictionary containing three lists: one of the inequivalent sites, one of the atom types they correspond to, and one of their multiplicities.

Return type:

dict
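
The shape of such a dictionary can be sketched without Pymatgen, starting from per-site equivalence labels of the kind a symmetry analysis produces. summarize_inequivalent_sites and the key names "sites", "species" and "multiplicities" are assumptions chosen for illustration; the actual keys returned by the library are not given here.

```python
from collections import Counter


def summarize_inequivalent_sites(equivalent_atoms, species):
    """Hypothetical helper: group sites by their equivalence label.

    equivalent_atoms[i] is the index of the representative site that
    site i is symmetry-equivalent to; species[i] is its element symbol.
    """
    counts = Counter(equivalent_atoms)
    sites = sorted(counts)  # one representative index per equivalence class
    return {
        "sites": sites,
        "species": [species[i] for i in sites],
        "multiplicities": [counts[i] for i in sites],
    }


# Toy input: two equivalent Ti sites followed by four equivalent O sites.
info = summarize_inequivalent_sites([0, 0, 2, 2, 2, 2], ["Ti"] * 2 + ["O"] * 4)
```

For this toy input the representatives are sites 0 and 2, with multiplicities 2 and 4.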

lightshow.pymatgen_utils.get_supercell_indexes_matching_primitive(prim, sc, compare, r)

lightshow.pymatgen_utils.make_supercell(prim, cutoff=9.0)

Generates a supercell with the desired lattice vector. The default cutoff is 9 Angstroms, an empirical value from our tests. It uses either the primitive or the conventional unit cell, whichever has fewer atoms.

Parameters:
  • prim (pymatgen.core.structure.Structure) – The primitive structure.

  • cutoff (float, optional) – The supercell cutoff.

Return type:

pymatgen.core.structure.Structure
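
One plausible reading of the cutoff is that each lattice vector is repeated until its length reaches the cutoff. The sketch below follows that reading only; it is an assumption, not a description of how make_supercell works, and repeats_for_cutoff is a hypothetical helper.

```python
import math


def repeats_for_cutoff(lengths, cutoff=9.0):
    """Hypothetical helper: smallest integer n with n * length >= cutoff."""
    # Assumption: the cutoff is a minimum length for each lattice vector.
    return [max(1, math.ceil(cutoff / length)) for length in lengths]


print(repeats_for_cutoff([4.6, 4.6, 2.96]))  # [2, 2, 4]
```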