alchemlyb tutorial in google colab #196
I was trying to build a notebook using my multiple xvg files, but I was not able to convert these files (one per lambda) into a dataset similar to the one returned by load_benzene. |
In my notebook, everything works fine if I use the load_benzene dataset, but I could not convert my own xvg files (one per lambda) into a dataset that works with my notebook. |
@andresilvapimentel Hi, thank you for your interest in alchemlyb.
Would you mind having a try and giving me some feedback? |
I am not sure you understood what I would like to do. |
Hi @andresilvapimentel, can you share the block of code you are running to load your XVG files, along with the exception/error you get? It's impossible for us to tell what the issue may be without this information. |
Hi @dotsdl, you asked exactly what I am asking about: I do not know how to load my xvg files. In the alchemlyb tutorial, the developers used: from alchemtest.gmx import load_benzene My question is how to change the line: to load my own xvg files. In other words: how do I convert my own xvg files into a dataset? |
Ah, now I see what you mean. I'll be the first to admit that we could use some clear examples in the documentation, but I believe you are referring to the example given in this section of the docs:
from alchemtest.gmx import load_benzene
from alchemlyb.parsing.gmx import extract_dHdl
dataset = load_benzene()
dhdl = extract_dHdl(dataset['data']['Coulomb'][0], 310)
dhdl.attrs['temperature']
dhdl.attrs['energy_unit']
If you run the In the Does this help? |
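For local files, one possible approach is to build the list of xvg paths yourself and pass each path to the parser, in place of the paths that `load_benzene()` provides. The sketch below is only an illustration: the helper names `lambda_index`, `collect_xvg`, and `load_dhdl` are hypothetical, and it assumes that each file name ends in its lambda window index (for example `dhdl.10.xvg`). `extract_dHdl` and `alchemlyb.concat` are the calls already quoted in this thread.

```python
import re
from pathlib import Path


def lambda_index(path):
    """Return the last integer in the file name, e.g. 10 for 'dhdl.10.xvg'."""
    numbers = re.findall(r"\d+", Path(path).stem)
    if not numbers:
        raise ValueError(f"no window index found in {path!r}")
    return int(numbers[-1])


def collect_xvg(directory, pattern="*.xvg"):
    """List the xvg files in `directory`, ordered by lambda window index."""
    # Sorting numerically avoids the order 0, 10, 2 that plain string sorting gives.
    return sorted(Path(directory).glob(pattern), key=lambda_index)


def load_dhdl(paths, temperature):
    """Parse each file with alchemlyb and stack the results into one DataFrame."""
    # Imported lazily so the helpers above also work without alchemlyb installed.
    import alchemlyb
    from alchemlyb.parsing.gmx import extract_dHdl

    return alchemlyb.concat([extract_dHdl(str(p), T=temperature) for p in paths])
```

With files in a hypothetical `Coulomb/` directory, `load_dhdl(collect_xvg("Coulomb"), 310)` would then play the role of the `dHdl_coul` object used later in this thread.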
The other error message shows up when I run the command block:
dHdl_coul = alchemlyb.concat([extract_dHdl(xvg, T=310) for xvg in dataset['Coulomb']])
from alchemlyb.visualisation import plot_ti_dhdl
The error message is:
ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/alchemlyb/visualisation/ti_dhdl.py in plot_ti_dhdl(dhdl_data, labels, colors, units, ax)
ValueError: Length of labels (2) should be the same as the number of data (4)
Do you know what is going on? |
From what you shared, it looks like:
should have instead for the last line:
Does this fix your issue? |
Hi @dotsdl, thank you for your suggestion. No, it did not fix the issue. It is even worse, because now the error is raised on the previous line:
KeyError Traceback (most recent call last)
KeyError: 'data'
Yesterday, I found a couple of additional issues with this loading process. I will try a few more things before sharing them with you. |
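The `KeyError: 'data'` is consistent with a dataset that is keyed directly by leg name (`dataset['Coulomb']`), whereas the object returned by `load_benzene()` keeps the file lists one level down, under `dataset['data']`. A minimal sketch of a dictionary with that nested layout is shown below. The function name `build_dataset` and the directory layout (`<root>/Coulomb/*.xvg`, `<root>/VDW/*.xvg`) are assumptions made for illustration, not part of alchemlyb or alchemtest.

```python
from pathlib import Path


def build_dataset(root, legs=("Coulomb", "VDW")):
    """Map each leg name to its xvg files, nested under a top-level 'data' key."""
    root = Path(root)
    # Plain string sorting is only correct for zero-padded names (dhdl.00.xvg, ...).
    data = {leg: sorted(str(p) for p in (root / leg).glob("*.xvg")) for leg in legs}
    return {"data": data}
```

With this layout, `dataset['data']['Coulomb']` is a list of paths, while `dataset['Coulomb']` raises `KeyError`. The reverse holds for a flat dictionary, so the indexing in the analysis code has to match whichever layout was built.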
I think this could be solved by using AutoMBAR instead of MBAR. |
I think this is a tricky issue: plot_ti_dhdl assumes a certain form of input data, so I might need your files to see what the best way forward is. |
Hi @xiki-tempula, thank you for your comment. I also found that the data_list generated in my notebook is in a format different from what the code requires to work well. I would like to suggest something. My recommendation is to work on a module to read the xvg files and generate a dataset with I will work on it a little more. If I do not solve the issue, I will send my notebook and dataset to get further help. Which files do you need? The xvg files are big. |
@andresilvapimentel alchemlyb is intended to function as a library rather than as an end-to-end solution, so technically all the components needed for a convergence analysis are available. We can of course give advice on how to use them. |
I tried to use from alchemlyb.estimators import AutoMBAR as MBAR but it did not work either; it gave the same error message about convergence. |
@andresilvapimentel If AutoMBAR cannot provide a solution, I'm afraid that it is out of the scope of this repo. Our AutoMBAR is just a wrapper that tries different MBAR solvers until one works. In most cases, AutoMBAR can find a solver that works, but ultimately it is pymbar that does the solving. |
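The "wrapper that tries different solvers" idea can be sketched generically as a loop that stops at the first method that does not raise. This is not alchemlyb's actual implementation: `first_successful` and the `solve` callable are hypothetical, and the method names in the default argument are placeholders that would need to be checked against the solver names pymbar accepts.

```python
def first_successful(solve, methods=("hybr", "adaptive", "BFGS")):
    """Call solve(method) for each method in turn and return the first result."""
    errors = {}
    for method in methods:
        try:
            return method, solve(method)
        except Exception as exc:  # broad on purpose: solvers can fail in many ways
            errors[method] = exc
    # Every method failed: report all collected errors together.
    raise RuntimeError(f"all methods failed: {errors}")
```

If every method fails, as reported in this thread, the fallback loop cannot help, and the data themselves (sampling, overlap between neighbouring windows) are the more likely problem.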
@xiki-tempula How can I send you the xvg files so that you can help me find a solution? What is the best way? Maybe I am generating the xvg files differently than you do. Do you agree? |
@andresilvapimentel Is it proprietary? If it is not, you could just put it in the comment section. |
@xiki-tempula They are my own xvg files: I would be very thankful if you could solve the issue. |
@andresilvapimentel OK, I guess you need the devel version; follow the guide at https://alchemlyb.readthedocs.io/en/latest/install.html#installing-from-source I could get the result with the devel version by
Note that your case seems to be very difficult to solve, so the calculation will take a long time. |
The u_nk_list you used has the same form as the data_list = [extract_u_nk(xvg, T=310) for xvg in dataset['Coulomb']] that I used, so it gave the same error message with MBAR. I have not used AutoMBAR because I need to install it first; I have not tried that yet. |
After the devel version was installed, it still did not converge after 1 h. WARNING: Did not converge to within specified tolerance. |
As I did not solve the issue, I ran the tutorial of the mdpow program using the benzene example to get the FEP data (the xvg files). Then I uploaded these xvg files into the alchemlyb notebook to try running the benzene example there.
LinAlgError Traceback (most recent call last)
<array_function internals> in eigh(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/numpy/linalg/linalg.py in _raise_linalgerror_eigenvalues_nonconvergence(err, flag)
LinAlgError: Eigenvalues did not converge

LinAlgError Traceback (most recent call last)
LinAlgError: Eigenvalues did not converge

Warning: BAR is likely to be inaccurate because of poor overlap. Improve the sampling, or decrease the spacing betweeen states. For now, guessing that the free energy difference is 0 with no uncertainty.
LinAlgError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/numpy/linalg/linalg.py in _raise_linalgerror_eigenvalues_nonconvergence(err, flag)
LinAlgError: Eigenvalues did not converge
This warning message (Warning: BAR is likely to be inaccurate because of poor overlap. Improve the sampling, or decrease the spacing betweeen states. For now, guessing that the free energy difference is 0 with no uncertainty.) is strange, because the tutorial should not have this issue. Can you help me, please? I would appreciate it, because I have no idea how to solve this issue. |
The benzene example for MDPOW is almost certainly not converged; the simulations are too short. I would suggest you start out with a simple system where you know that you can get converged data. This can be benzene, but you should run it for long enough (probably a few ns per window, but ultimately you should do some convergence analysis). Or use the GROMACS FEP tutorial. In any case, you should have a test case that you trust. Manually calculating the free energy difference over the whole data set as you did in (2) and (3) is a good start. Try different estimators. Adjust convergence parameters. Then do the same manually including only half the data. See if anything breaks in between. Ultimately, alchemlyb is not a plug-and-play solution. It provides building blocks and you need to develop an understanding of what the building blocks can do. I suggest reading recent papers on FEP calculations (Justin Lemkul's GROMACS tutorial (1) and the best practices paper (2)).
|
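The "do the same with only half the data" check can be sketched independently of any estimator. In the example below, `statistics.mean` is only a stand-in for a free energy estimator, and `compare_half_and_full` is a hypothetical helper. With alchemlyb, the equivalent step would presumably be to truncate each window's DataFrame before concatenating and refitting.

```python
from statistics import mean


def compare_half_and_full(samples, estimator=mean):
    """Return (estimate on first half, estimate on all data, absolute difference)."""
    samples = list(samples)
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    half = estimator(samples[: len(samples) // 2])
    full = estimator(samples)
    return half, full, abs(full - half)
```

A difference that is large compared with the reported uncertainty suggests the estimate is still drifting and that the windows need more sampling.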
@orbeckst @xiki-tempula, I've started working on moving the testing of different solutions into MBAR at choderalab/pymbar#442. It would be good to get any comments on what you would suggest. |
As a note: Google Colab runs Python 3.7.13 (default, Apr 24 2022, 01:04:09) (as of this writing). alchemlyb will likely drop Python 3.7 support soon (in line with NEP29, see e.g. PR #214) to keep up with dependencies such as Pandas (supports 3.8 – 3.10 in their latest release). Therefore, unless Google updates Colab, it seems pointless to create tutorials in Colab if we cannot use the latest version of alchemlyb. |
The minimum Python requirement for alchemlyb is 3.8, so it won't run in Colab until Google changes the version of Python. Please re-open once Colab has been modernized. |
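A notebook can check this requirement in its first cell before attempting an install. This is a minimal sketch: the function name `python_is_supported` is hypothetical, and the `(3, 8)` minimum is simply the value stated in this thread.

```python
import sys

MINIMUM = (3, 8)  # minimum Python version stated in this thread


def python_is_supported(version=None, minimum=MINIMUM):
    """Return True if `version` (default: the running interpreter) meets `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version[:2]) >= tuple(minimum)
```

For the Colab interpreter quoted above, `python_is_supported((3, 7, 13))` evaluates to `False`.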
It would be nice if the developers wrote an alchemlyb tutorial in Google Colab using a simple ligand/protein complex example.