SOAPify.classify

Submodule to classify trajectories

Contains the definitions of the classification container SOAPclassification and of the references container SOAPReferences, along with the functions that apply a classification to a given dataset.

Functions

applyClassification(SOAPTrajData, ...[, ...])

Applies the references to a dataset.

createReferencesFromTrajectory(...[, ...])

Generate a SOAPReferences object.

getDistanceBetween(data, spectra, ...)

Generate an array with the distances between the data and the given spectra

getDistancesFromRef(SOAPTrajData, ...[, ...])

generates the distances between a SOAP-hdf5 trajectory and the given references

getDistancesFromRefNormalized(SOAPTrajData, ...)

shortcut for SOAPify.classify.getDistancesFromRef() forcing normalization

getReferencesFromDataset(dataset)

Given an h5py.Dataset, returns a SOAPReferences with the initialized data

mergeReferences(*x)

Merges a list of SOAPReferences into a single object

saveReferences(h5position, ...)

Export the given references in the indicated group/hdf5 file

Classes

SOAPReferences(names, spectra, lmax, nmax)

Stores the spectra selected for an environments dictionary.

SOAPclassification(distances, references, legend)

Stores the information about the SOAP classification of a trajectory.

class SOAPify.classify.SOAPReferences(names, spectra, lmax, nmax)[source]

Bases: object

Stores the spectra selected for an environments dictionary.

__init__(names, spectra, lmax, nmax)
lmax: int

the parameter lmax used in the calculation

names: list[str]

the names of the references

nmax: int

the parameter nmax used in the calculation

spectra: np.ndarray[np.float64]

the SOAP spectra of the references

class SOAPify.classify.SOAPclassification(distances, references, legend)[source]

Bases: object

Stores the information about the SOAP classification of a trajectory.

__init__(distances, references, legend)
distances: np.ndarray[float]

stores the (per frame) per atom information about the distance from the closest reference fingerprint

legend: list[str]

stores the references legend

references: np.ndarray[int]

stores the (per frame) per atom index of the closest reference

SOAPify.classify.applyClassification(SOAPTrajData, references, distanceCalculator, doNormalize=False)[source]

Applies the references to a dataset.

Generates the distances from the given references and then classifies each atom by the closest element in the dictionary.

Parameters
  • SOAPTrajData (h5py.Dataset) – the dataset containing the SOAP trajectory

  • references (SOAPReferences) – the container of the references

  • distanceCalculator (Callable) – the function to calculate the distances

  • doNormalize (bool, optional) – tells the function whether the given data needs to be normalized before calculating the distance. Defaults to False.

Returns

The result of the classification

Return type

SOAPclassification
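
The classification step described above amounts to picking, for every atom in every frame, the reference with the smallest distance. The following is a minimal NumPy sketch of that idea, not the library's implementation: the function name classify_by_closest, the assumed array layout (frames, atoms, references), and the toy values are all hypothetical.

```python
import numpy as np


def classify_by_closest(distances_from_refs):
    """Return (closest distance, index of the closest reference) per frame and atom.

    Assumption: the last axis of the input runs over the references.
    """
    references = np.argmin(distances_from_refs, axis=-1)
    distances = np.min(distances_from_refs, axis=-1)
    return distances, references


# toy data: 1 frame, 2 atoms, 3 references (illustrative values only)
toy = np.array([[[0.5, 0.1, 0.9], [0.2, 0.7, 0.3]]])
closest_distance, closest_index = classify_by_closest(toy)
```

The two returned arrays play the roles that the distances and references attributes of SOAPclassification are documented to hold; an entry of closest_index can be used to look up a name in a legend list.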

SOAPify.classify.createReferencesFromTrajectory(h5SOAPDataSet, addresses, lmax, nmax, doNormalize=True)[source]

Generate a SOAPReferences object.

The object stores the data found in h5SOAPDataSet. The atoms are selected through the addresses dictionary.

Parameters
  • h5SOAPDataSet (h5py.Dataset) – the dataset with the SOAP fingerprints

  • addresses (dict) – the dictionary with the names and the addresses of the fingerprints. The keys are used as the names of the references, and the value assigned to each key must be a tuple (or similar) with the number of the chosen frame and the atom number (for example dict(example=(framenum, atomID)))

  • doNormalize (bool, optional) – If True normalizes the SOAP vector before storing them. Defaults to True.

  • settingsUsedInDscribe (dscribeSettings|None, optional) – If None, the SOAP vectors are not preprocessed; otherwise the SOAP vectors are decompressed, as dscribe omits the symmetric part of the spectra. Defaults to None.

Returns

the container with the selected references

Return type

SOAPReferences
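
To make the shape of the addresses dictionary concrete, here is a small sketch that selects fingerprints by (frame, atom) pairs from an in-memory array. It is an illustration under stated assumptions and does not call SOAPify: the array stands in for the h5py.Dataset and is assumed to be indexed as (frame, atom, component), the helper select_references and the reference names are hypothetical, and normalization is assumed to mean division by the Euclidean norm.

```python
import numpy as np

# hypothetical stand-in for the SOAP dataset: 3 frames, 4 atoms, 5 components
rng = np.random.default_rng(0)
soap_data = rng.random((3, 4, 5))

# keys become the reference names, values are (frame number, atom ID)
addresses = dict(bulk=(0, 1), surface=(2, 3))


def select_references(data, addresses, do_normalize=True):
    """Collect the fingerprints addressed by (frame, atom) pairs."""
    names = list(addresses.keys())
    spectra = np.array([data[frame, atom] for frame, atom in addresses.values()])
    if do_normalize:
        # assumption: "normalize" means dividing each vector by its Euclidean norm
        spectra = spectra / np.linalg.norm(spectra, axis=1, keepdims=True)
    return names, spectra


names, spectra = select_references(soap_data, addresses)
```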

SOAPify.classify.getDistanceBetween(data, spectra, distanceCalculator)[source]

Generate an array with the distances between the data and the given spectra

TODO: enforce the np.ndarray

Parameters
  • data (np.ndarray) – the array of the data

  • spectra (np.ndarray) – the references

  • distanceCalculator (Callable) – the function to calculate the distances

Returns

the array of the distances (the shape is (data.shape[0], spectra.shape[0]))

Return type

np.ndarray
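
The documented output shape, (data.shape[0], spectra.shape[0]), corresponds to one distance per (data row, reference) pair. A minimal sketch of such a pairwise evaluation is given below; the function pairwise_distances and the Euclidean euclidean callable are hypothetical stand-ins chosen for illustration, not the library's code or its distance functions.

```python
import numpy as np


def pairwise_distances(data, spectra, distance_calculator):
    """Evaluate distance_calculator on every (data row, reference) pair."""
    distances = np.empty((data.shape[0], spectra.shape[0]))
    for i, row in enumerate(data):
        for j, ref in enumerate(spectra):
            distances[i, j] = distance_calculator(row, ref)
    return distances


def euclidean(x, y):
    """Hypothetical distance callable: plain Euclidean distance."""
    return float(np.linalg.norm(x - y))


data = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
spectra = np.array([[0.0, 0.0], [3.0, 4.0]])
result = pairwise_distances(data, spectra, euclidean)
```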

SOAPify.classify.getDistancesFromRef(SOAPTrajData, references, distanceCalculator, doNormalize=False)[source]

generates the distances between a SOAP-hdf5 trajectory and the given references

Parameters
  • SOAPTrajData (h5py.Dataset) – the dataset containing the SOAP trajectory

  • references (SOAPReferences) – the container of the references

  • distanceCalculator (Callable) – the function to calculate the distances

  • doNormalize (bool, optional) – tells the function whether the given data needs to be normalized before calculating the distance. Defaults to False.

Returns

the “trajectory” of distance from the given references

Return type

np.ndarray

SOAPify.classify.getDistancesFromRefNormalized(SOAPTrajData, references)[source]

shortcut for SOAPify.classify.getDistancesFromRef() forcing normalization

See SOAPify.classify.getDistancesFromRef(); the distance calculator is SOAPdistanceNormalized() and doNormalize is set to True.

Parameters
  • SOAPTrajData (h5py.Dataset) – the dataset containing the SOAP trajectory

  • references (SOAPReferences) – the container of the references

Returns

the trajectory of distance from the given references

Return type

np.ndarray

SOAPify.classify.getReferencesFromDataset(dataset)[source]

Given an h5py.Dataset, returns a SOAPReferences with the initialized data

TODO: check if the dataset contains the needed references

Parameters

dataset (h5py.Dataset) – the dataset with the references

Returns

the prepared references container

Return type

SOAPReferences

SOAPify.classify.mergeReferences(*x)[source]

Merges a list of SOAPReferences into a single object

Raises

ValueError – if the lmax and the nmax of the references are not the same

Returns

a new SOAPReferences that contains the concatenated list of references

Return type

SOAPReferences
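
The merge rule above (concatenate the references, and refuse to merge when lmax or nmax differ) can be sketched with a toy container. ToyReferences and merge_toy_references are hypothetical names that only mirror the documented fields; they are not the SOAPify classes, and the spectra are plain lists here to keep the example dependency-free.

```python
from dataclasses import dataclass


@dataclass
class ToyReferences:
    """Hypothetical stand-in mirroring the documented SOAPReferences fields."""

    names: list
    spectra: list
    lmax: int
    nmax: int


def merge_toy_references(*refs):
    """Concatenate names and spectra, rejecting mismatched lmax or nmax."""
    first = refs[0]
    names, spectra = [], []
    for ref in refs:
        if ref.lmax != first.lmax or ref.nmax != first.nmax:
            raise ValueError("lmax and nmax must match across all references")
        names.extend(ref.names)
        spectra.extend(ref.spectra)
    return ToyReferences(names, spectra, first.lmax, first.nmax)


a = ToyReferences(["bulk"], [[1.0, 0.0]], lmax=8, nmax=8)
b = ToyReferences(["surface"], [[0.0, 1.0]], lmax=8, nmax=8)
merged = merge_toy_references(a, b)
```

Checking the parameters before concatenating matters because spectra computed with different lmax or nmax generally have different lengths and are not comparable component by component.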

SOAPify.classify.saveReferences(h5position, targetDatasetName, refs)[source]

Export the given references in the indicated group/hdf5 file

Parameters
  • h5position (h5py.Group|h5py.File) – the file object or the group in which to save the references

  • targetDatasetName (str) – the name to give to the list of references

  • refs (SOAPReferences) – the SOAPReferences object to be exported