Refactoring the data module #11
Labels
Refactor
code refactoring
Comments
SharpLonde added a commit to SharpLonde/DeePTB that referenced this issue on Dec 4, 2023
* Prototype code for loading Hamiltonian
* add 'ABACUSDataset' in data module
* modified "basis.dat" storage & can load overlap
* recover some original dataset settings
* add ABACUSDataset in init
* Add the in memory version of ABACUSDataset
* add ABACUSInMemoryDataset in data package
QG-phy added a commit that referenced this issue on Feb 2, 2024
* add data * adapt data and nn module of nequip into deeptb * just modify some imports * update torch-geometry * add kpoint eigenvalue support * add support for nested tensor * update * update data and add batchlize hamiltonian * update se3 rotation * update test * update * debug e3 * update hamileig * delete nequip nn and write our own based on PyG * update nn * nn refactor, write hamiltonian and hop function * update sk hamiltonian and onsite function * refactor sktb and add register for descriptor * update param prototype and dptb * refactor index mapping to data transform * debug sktb and e3tb module * finish debuging sk and e3 * update data interfaces * update r2k and transform * remove dash line in file names * fnishied debugging deeptb module * finish debugging hr2hk * update overlap support * update base trainer and example quantities * update build model * update trainer * update pyproject.toml dependencies * update bond reduction and self-interaction * debug nnsk * nnsk run succeed, add from v1 json model * add nnsk test example of AlAs coupond system * Add 'ABACUSDataset' in data module (#9) * Prototype code for loading Hamiltonian * add 'ABACUSDataset' in data module * modified "basis.dat" storage & can load overlap * recover some original dataset settings * add ABACUSDataset in init * debug new dptb and trainer * debug datasets * pass cmd line train mod to new model and data * add some comments in neighbor_list_and_relative_vec. 
* add overlap fitting support * update baseline descriptor and debug validationer * update e3deeph module * update deephe3 module * Added ABACUSInMemoryDataset in data module (#11) * Prototype code for loading Hamiltonian * add 'ABACUSDataset' in data module * modified "basis.dat" storage & can load overlap * recover some original dataset settings * add ABACUSDataset in init * Add the in memory version of ABACUSDataset * add ABACUSInMemoryDataset in data package * update dataset and add deephdataset * gpu support and debugging * add dptb+nnsk mix model, debugging build, restart * align run.py, test.py, main.py * debugging * final * add new model backbone on allegro * add new e3 embeding and lr schedular * Added `DefaultDataset` (#12) * Prototype code for loading Hamiltonian * add 'ABACUSDataset' in data module * modified "basis.dat" storage & can load overlap * recover some original dataset settings * add ABACUSDataset in init * Add the in memory version of ABACUSDataset * add ABACUSInMemoryDataset in data package * Added `DefaultDataset` and unified `ABACUSDataset` * improved DefaultDataset & add `dptb data` entrypoint for preprocess * update `build_dataset` * aggregating new data class * debug plugin savor and support atom specific cutoffs * refactor bond reduction and rme parameterization * add E3 fitting analysis and E3 rescale * update LossAnalysis and e3baseline model * update band calc and debug nnsk add orbitals * update datatype switch * Unified dataset IO (#13) * Prototype code for loading Hamiltonian * add 'ABACUSDataset' in data module * modified "basis.dat" storage & can load overlap * recover some original dataset settings * add ABACUSDataset in init * Add the in memory version of ABACUSDataset * add ABACUSInMemoryDataset in data package * Added `DefaultDataset` and unified `ABACUSDataset` * improved DefaultDataset & add `dptb data` entrypoint for preprocess * update `build_dataset` * update `data` entrypoint * Unified dataset IO & added ASE 
trajectory support * Add support to save `.pth` files with different `info.json` settings. * Bug fix in dealing with "ase" info. * updated `argcheck` for setinfo. * added setinfo check when building dataset. * file IO improvements * bug fix in loading `info.json` * update e3 descriptor and OrbitalMapper * Bug fix in reading trajectory data (#15) * add comment and complete eig loss * update new embedding and dependencies * New version of `E3statistics` (#17) * new version of `E3statistics` function added in DefaultDataset. * fix bug in dealing with scalars in `E3statistics` * add "decay" option in E3statistics to return edge length dependence * fix bug in getting rmes when doing stat & update argcheck * adding statistics initialization * debug nnsk batchlization and eigenvalues loading * debug nnsk * optimizing saving best checkpoint * Pr/44 (#19) * add comments QG * add comment QG * debug nnsk add orbital and strain * update `.npy` files loading procedure in DefaultDataset (#18) * optimizing init and restart param loading * update nnsk push thr * update mix model param and deeptb sktb param * BUG FIX in loading `kpoints.npy` files with `ndim==3` (#20) * bug fix in loading `kpoints.npy` files with `ndim==3` * added tests for nnsk training * main program for test_train * refactor test * update nrl * denote run --------- Co-authored-by: Sharp Londe <93334987+SharpLonde@users.noreply.github.com> Co-authored-by: qqgu <guqq_phy@qq.com> Co-authored-by: Qiangqiang Gu <98570179+QG-phy@users.noreply.github.com>
Describe Current Status and Possible Solution
To fit graph-based networks better and to enable GPU support and multi-worker parallelization, the data module should be refactored on top of the native data utilities that PyTorch supports. To do so, we need to:
- Reframe the structure (`BaseStructure`) class using `torch_geometric.data`.
- Replace the processor with datasets, in-memory datasets, and dataloaders built on PyTorch and torch-geometric.
Additional Context
No response