Pdb files too big generated on GEMC_NpT for opening/transferring #350
By setting CoordinatesFreq to this value for a simulation with this number of steps, you are producing output 50 times. Since each output is about 2 MB of data, you can raise the CoordinatesFreq value to write output less often. You could also do as you suggest and set CoordinatesFreq equal to the number of steps in your simulation, so that only the final configuration is written.
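The arithmetic behind that estimate can be checked with a short sketch. It uses the numbers quoted in this thread (a run of 500,000N steps, CoordinatesFreq of 10,000N, roughly 2 MB per frame); the function name and the system size of 1000 molecules are hypothetical, and whether an extra frame is written at step 0 is not accounted for.

```python
def estimate_output(total_steps, coordinates_freq, mb_per_frame):
    """Return (number of frames written, approximate file size in MB)."""
    frames = total_steps // coordinates_freq
    return frames, frames * mb_per_frame


n_molecules = 1000  # hypothetical system size; it cancels out of the frame count
run_steps = 500_000 * n_molecules

# Compare the original setting with two less frequent ones.
for freq in (10_000 * n_molecules, 100_000 * n_molecules, run_steps):
    frames, size_mb = estimate_output(run_steps, freq, mb_per_frame=2.0)
    print(f"CoordinatesFreq={freq}: {frames} frames, ~{size_mb:.0f} MB")
```

With the original setting this gives 50 frames and about 100 MB per box; setting CoordinatesFreq equal to the run length leaves a single frame.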
@LSchwiebert, thanks for your reply. But would there be an easy way for me to extract just this final configuration from this huge PDB file?
@mleao882 set "RestartFreq true LastStepNumber" in the config file. This will output the PDB only at that step. Alternatively, you could check out the development branch, which supports DCD coordinates. Process for cloning/building the dev branch: Add this to your conf file: The resulting file will be 100x smaller than the PDB trajectory.
I think someone else in the collaboration would be better able to answer your question, but I believe it is a human-readable file, so you should be able to open it with an editor and extract the last part. I would search for the frame header in the file. If you have to open it with VMD or a similar tool, then you are back at your original problem.
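The text-based approach described above can also be scripted, which avoids opening a 100 MB file in an editor. The following is a minimal sketch, not a GOMC tool: it assumes each frame in the trajectory begins with a line starting with `CRYST1`, which should be verified against the actual file (the marker is a parameter for that reason). The function name is hypothetical.

```python
def extract_last_frame(src_path, dst_path, frame_start="CRYST1"):
    """Copy the last frame of a multi-frame PDB file to dst_path.

    Assumption: every frame begins with a line starting with `frame_start`.
    The file is streamed line by line, so only one frame is held in memory.
    Returns the number of lines written.
    """
    last_frame = []
    with open(src_path) as src:
        for line in src:
            if line.startswith(frame_start):
                last_frame = []  # a new frame begins; drop the previous one
            last_frame.append(line)
    with open(dst_path, "w") as dst:
        dst.writelines(last_frame)
    return len(last_frame)
```

A call such as `extract_last_frame("box0.pdb", "box0_last.pdb")` (hypothetical file names) can be run on the supercomputer, so that only the small single-frame file needs to be transferred.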
@mleao882, to extract the final configuration, you can use VMD.
The only issue with this solution is that the occupancy and beta column data will be set to the values from frame 0. This is because VMD reads the occupancy and beta values only from frame 0 and does not update them for each frame. In the script above, I set their values to zero to avoid any problem or mistake. In GOMC, we use the occupancy column to indicate whether a molecule is in the box or not. However, we also set the coordinates of any molecule that does not exist in the box to zero, so you can exclude zero coordinates (X,Y,Z) in your analysis. As @GregorySchwing mentioned, we now support binary coordinate files, which store coordinates with higher precision in less space. Similar to NAMD, GOMC prints the trajectory coordinates (.dcd) in single precision and the restart coordinates (.coor) in double precision. In addition, similar to the restart PDB file, GOMC now outputs a restart PSF file for each simulation box.
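For analysis outside VMD, the zero-coordinate exclusion described above can be sketched in a few lines. This is an illustrative helper with a hypothetical name, not part of GOMC; it relies only on the standard fixed-column PDB layout (x, y, z in columns 31-38, 39-46, 47-54). Note that an atom that genuinely sits exactly at the origin would also be dropped.

```python
def present_atoms(pdb_lines):
    """Yield (x, y, z) for ATOM/HETATM records whose coordinates are not all zero.

    Coordinates are read from the fixed PDB columns 31-38, 39-46 and 47-54.
    Records at exactly (0, 0, 0) are treated as molecules absent from the box.
    """
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        if (x, y, z) != (0.0, 0.0, 0.0):
            yield x, y, z
```

Because the function accepts any iterable of lines, it can be fed an open file handle directly, for example `list(present_atoms(open("box0_last.pdb")))` with a hypothetical file name.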
While running GEMC_NpT simulations, the PDB files generated for both boxes (L and V) are bigger than 100 MB. This makes them hard to transfer from the supercomputer to my machine and hard to open on the supercomputer itself, which in turn makes analysis in VMD a problem. After checking the input file, I believe this is due to the CoordinatesFreq used: I set it to 10,000N (N being the number of molecules in the simulation), while the simulation was allowed to run for 500,000N steps. Is there anything I could do to sort this problem out? Maybe just extracting the final configuration from the PDB file would be enough for me. Thanks in advance!