On-the-fly coordinate transformations #786
Comments
https://github.com/richardjgowers/MDAnalysis-coarsegraining I've already hacked together something like this for coarse-graining. The difference to other transforms mentioned is this changes the number of particles in the system as you're condensing many positions into one. Adding a pbc/undo pbc would be a cool transform. I think some subroutines along these lines exist ( @jbarnoud I see we're analysing the same system lately! :P |
@richardjgowers I've seen your CGUniverse class. I forgot to link it in the proposal. I am still not sure if I want this proposal to be able to change the topology, perhaps as a second step. When I added the virtual particles I mention above, I really would have liked to visualize them with nglview.
The transforms exist already. But they have to be called on a per-frame basis. They will not be called by nglview for instance.
I hope you have fun with the system. I sure do :D |
I'd love to have fit+pbc and remap in such a way that the unit cell is not rotating around the fitted group. I don't know of any tool that can do this even though it is clear that it's possible: you just have to carve the same Wigner-Seitz unit cell out of the infinitely repeated (and rotated) system. |
@orbeckst Since the transformations get to alter the |
So I like this, I'd just change it so that the object which accepts transformations is the Reader, so all your |
We could store these just as a list in the Reader, then inside the Reader we'd have code like..
def _read_next_timestep(self):
    ts = self._read_next()
    for transform in self.transformations:
        transform(self)
    return ts
So each
from MDAnalysis.transformations import MakeWhole
u.trajectory.transforms.append(MakeWhole(u.bonds))
|
@richardjgowers This is exactly what I had in mind. I will try to come up with a working prototype in the next few days (with one or two transformations to play with). But I will be rather busy so do not hold your breath. |
@jbarnoud so essentially adding another layer between I might have a play with where's best for this... part of me thinks that you could use a decorator like
@with_transforms
def next(self):
    return self._read_next_timestep()
|
Basically yes. But the decorator may be overkill. I was thinking about something like:
def _read_next_timestep(self):
    ts = self._read_next()  # Overloaded by the reader
    for transformation in self._transformations:
        ts = transformation(ts)
    return ts
|
Yes @jbarnoud solution looks good. Though I would put it directly into the |
@kain88-de Except that The change in the readers would be small, just a matter of changing the name of a method where it is defined. Not even where it is called. |
So with how #868 has worked out, I don't really see why this can't be implemented using the aux mechanisms. Maybe we subclass aux's to get them to be called transforms everywhere, but I think the mechanics look sound. |
@richardjgowers I have a clear idea on how it can get implemented alongside aux, and a rough idea on how it can interact with aux. Could you elaborate on how it could get implemented using aux? |
def _read_frame_with_aux(self, frame):
    """Move to *frame*, updating ts with trajectory and auxiliary data."""
    ts = self._read_frame(frame)
    for aux in self.aux_list:
        ts = self._auxs[aux].read_ts(ts)
    return ts
With the snippet above, there's no reason with |
@davidercruz this is all relevant to you |
Hey there. After talking with @jbarnoud, he proposed that I look at the reader codes and make a scheme of what happens when we iterate over a trajectory. Here's what I found: When the user loads a trajectory through the MDAnalysis Universe class eg.
When the user requests another frame of the trajectory eg.
@richardjgowers, I saw #1215 . From what I understand, the transformations would be applied every time a frame is loaded from the file. But what happens when the trajectory is loaded into the memory? Is the frame actually modified permanently? @jbarnoud was telling me that the A solution would be to apply the transformations when the trajectory is transferred to the memory. What do you think? |
@davidercruz yep memoryreader will be a headache. I agree doing it all up front is the best solution, so each time a transformation is added, the (in-memory) trajectory is iterated through and the transformation applied. (You could maybe come up with some solution which lazily applies a transformation the first time a frame was requested, but that just sounds like asking for trouble.) Your schematic of what happens when iterating looks solid, there's also You could redefine |
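A minimal sketch of the "do it all up front" idea for an in-memory trajectory, as discussed above. This is not the MemoryReader implementation: `InMemoryTrajectory`, `add_transformation`, and `shift_x` are hypothetical names, and frames are plain lists of numbers. The point it illustrates is that the stored frames are modified once, when the transformation is registered, so reading the same frame twice never applies the transformation twice.

```python
class InMemoryTrajectory:
    """Hypothetical in-memory trajectory; each frame is a list of coordinates."""

    def __init__(self, frames):
        self._frames = [list(frame) for frame in frames]
        self.transformations = []

    def add_transformation(self, transformation):
        # Apply once, up front, to every stored frame.
        # The stored data is changed permanently.
        self._frames = [transformation(frame) for frame in self._frames]
        self.transformations.append(transformation)

    def __getitem__(self, index):
        # Frames are already transformed, so reading applies nothing.
        return self._frames[index]


def shift_x(frame):
    """Toy transformation: add 1.0 to every coordinate."""
    return [x + 1.0 for x in frame]


traj = InMemoryTrajectory([[0.0, 1.0], [2.0, 3.0]])
traj.add_transformation(shift_x)
first_read = traj[0]
second_read = traj[0]  # same values as first_read, no double application
```

The trade-off is that registering a transformation costs a full pass over the trajectory, and the untransformed coordinates are no longer available afterwards.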
@richardjgowers yes, that place avoids changes to the format readers. My only issue is that it's not what the function was initially designed to do. But code evolves :D What about changing things at the reader level? For example, the |
…unction (Issue MDAnalysis#786) Transformations are applied by the readers, either by being added by the user or passed as a kwarg when instancing the universe; added tests for the modified reader classes: ProtoReader, SingleFrameReader, ChainReader and MemoryReader
This proposal is now implemented. |
On-the-fly coordinate transformations
So far, the only transformation MDAnalysis applies to coordinates read from a trajectory is a unit conversion. Any other transformation must be done by the user on a frame-by-frame basis. Yet, in some use cases, the user does not directly access the frames, and therefore cannot apply the transformations. This is especially the case with analyses and visualizations. I propose a general mechanism to declare coordinate transformations that will be applied by the reader.
Use cases
The main use cases for the proposal are analyses and visualizations that require structure alignments or periodic boundary corrections.
Analyses such as RMSD calculations require the structures to be aligned. So far, the RMSD class controls the fit with a keyword, and only implements a single way of doing that fit. This means that a user who wants a different fit must save an intermediate trajectory. It also means that any other analysis that requires a fit needs to implement its own way of handling it; such implementations are redundant and potentially inconsistent with the other analyses.
@dotsdl recently posted a blog post on nglview. This library allows visualizing a trajectory from MDAnalysis in a Jupyter notebook. Like most (if not all) visualization software, nglview does not fix the periodic boundary conditions, which can lead to ugly artifacts with bonds crossing the box. MDAnalysis could fix the periodic box in a way that is transparent to the visualization library.
Also, some transformations are painful to do with the tools embedded in the simulation packages, and can require multiple calls to tools like gmx trjconv with several intermediate trajectories. With the proposed mechanism it would be possible to declare a workflow of transformations in MDAnalysis and apply them frame by frame, without intermediate files.
New APIs
User facing API
Only a few changes are needed in the user-facing API: at a minimum, a method should be added to Universe to register a transformation; a few methods could also be added to inspect and modify the transformation workflow.
A transformation is implemented as a callback that takes a TimeStep as argument and modifies it in place. Using a callback means that a transformation can be implemented in the simplest way possible for that transformation. The simplest transformations can be implemented as a function, as can transformations that require a constant argument:
The more complicated transformations can be implemented as classes with a __call__ method.
Ideally, transformations that require access to the topology (e.g. making molecules whole) can access it via the universe attribute of the TimeStep, for a minimum burden on the user.
It would be nice to be able to add several transformations in one go as a workflow:
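The three callback styles described above (plain function, function with a constant argument, class with a `__call__` method) and a workflow could look roughly like the sketch below. It is an illustration only, not MDAnalysis code: the `TimeStep` stand-in, `center_on_origin`, `translate`, `Scale`, and `apply_workflow` are hypothetical names, and positions are plain nested lists. Each callback modifies the frame in place and also returns it, so both calling conventions seen in this thread work.

```python
class TimeStep:
    """Hypothetical stand-in for a trajectory frame (not the MDAnalysis class)."""

    def __init__(self, positions):
        self.positions = [list(p) for p in positions]


def center_on_origin(ts):
    """Plain-function transformation: move the centroid to the origin."""
    n = len(ts.positions)
    centroid = [sum(p[i] for p in ts.positions) / n for i in range(3)]
    for p in ts.positions:
        for i in range(3):
            p[i] -= centroid[i]
    return ts


def translate(vector):
    """Transformation with a constant argument, built as a closure."""
    def wrapped(ts):
        for p in ts.positions:
            for i in range(3):
                p[i] += vector[i]
        return ts
    return wrapped


class Scale:
    """Class-based transformation: the parameters live on the instance."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, ts):
        for p in ts.positions:
            for i in range(3):
                p[i] *= self.factor
        return ts


def apply_workflow(ts, transformations):
    """Apply every transformation, in order, to one frame."""
    for transformation in transformations:
        ts = transformation(ts)
    return ts


# A workflow is just an ordered list of callables.
workflow = [center_on_origin, Scale(2.0), translate([1.0, 0.0, 0.0])]
ts = apply_workflow(TimeStep([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]), workflow)
```

Because every style reduces to "a callable that takes a frame", the code that applies the workflow does not need to know how a given transformation was written.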
Internal API
Internally, the Universe class must allow registering transformations and must pass them to the coordinate.base.ProtoReader class, which applies them, in order, when reading a new frame.
This proposal shares problems with proposal #785 by @richardjgowers. Both proposals need a way to declare methods, and to execute them when reading a new frame. Auxiliary data should be read before executing the coordinate transformations so the transformations can use the auxiliary data.
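The ordering described above (read the frame, attach auxiliary data, then run the transformations in registration order) can be sketched with a toy reader. Everything here is hypothetical and only meant to show the control flow: `Frame`, `SketchReader`, `add_transformation`, and `shift_by_aux` are not MDAnalysis names, frames come from a list instead of a file, and positions are one-dimensional for brevity.

```python
class Frame:
    """Hypothetical frame holding positions and auxiliary values."""

    def __init__(self, positions):
        self.positions = list(positions)
        self.aux = {}


class SketchReader:
    """Toy reader that serves frames from a list instead of a file."""

    def __init__(self, frames, aux_sources=None, transformations=()):
        self._frames = frames
        self._aux_sources = dict(aux_sources or {})
        self._transformations = list(transformations)

    def add_transformation(self, transformation):
        self._transformations.append(transformation)

    def _read_frame(self, index):
        # A real reader would decode the frame from the trajectory file here.
        return Frame(self._frames[index])

    def read_frame(self, index):
        ts = self._read_frame(index)
        # Auxiliary data first, so that transformations can rely on it.
        for name, values in self._aux_sources.items():
            ts.aux[name] = values[index]
        # Then the transformations, in registration order.
        for transformation in self._transformations:
            ts = transformation(ts)
        return ts


def shift_by_aux(ts):
    """Example transformation that uses auxiliary data read for this frame."""
    ts.positions = [x + ts.aux["offset"] for x in ts.positions]
    return ts


reader = SketchReader([[0.0, 1.0], [2.0, 3.0]],
                      aux_sources={"offset": [10.0, 20.0]})
reader.add_transformation(shift_by_aux)
frame = reader.read_frame(1)
```

Since the frame is rebuilt from the source on every read, the transformation is applied exactly once per read and the underlying data stays untouched.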
Optionally, it could be useful to allow a coordinate transformation to add auxiliary data (e.g. the rotation matrix used for the fitting). This should be trivial if auxiliary data are attached to the TimeStep.
Collection of transformations
For this proposal to be really useful, MDAnalysis must have a collection of transformations ready to use. The most obvious ones are the already implemented structure fitting and PBC correction methods.
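As an illustration of what a ready-made PBC transformation might look like in this callback form, here is a minimal sketch that wraps every position into an orthorhombic box. It is not the existing MDAnalysis PBC code: `TimeStep` is a hypothetical stand-in, `wrap_into_box` is an invented name, and the wrapping is done atom by atom, so molecules are not kept whole.

```python
class TimeStep:
    """Hypothetical stand-in for a trajectory frame (not the MDAnalysis class)."""

    def __init__(self, positions):
        self.positions = [list(p) for p in positions]


def wrap_into_box(box_lengths):
    """Build a transformation that wraps positions into an orthorhombic box.

    box_lengths holds the (positive) box edge lengths along x, y and z.
    """
    def wrapped(ts):
        for p in ts.positions:
            for i, length in enumerate(box_lengths):
                # Python's % yields a non-negative result for a positive divisor.
                p[i] = p[i] % length
        return ts
    return wrapped


wrap = wrap_into_box([10.0, 10.0, 10.0])
ts = wrap(TimeStep([[-1.0, 12.5, 3.0]]))
```

A molecule-aware version would need bond information from the topology, which is where access to the universe from the frame would come in.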
Limitations
Some existing transformations require changing the topology. This is the case for any transformation that adds virtual particles, or any transformation that coarse-grains or back-maps a trajectory.
For example, I recently wrote a script that adds virtual particles to each frame of a trajectory to test the possible bias of an analysis. Also, several of us work with coarse-grained models and need to analyse atomistic simulations as if they were coarse-grained.
These transformations require adapting the topology of the system, and I do not see a clean way of doing it. Also, it is not clear if such transformations are in the scope of this proposal (even though I would love to be able to use them).