This repository has been archived by the owner on Mar 24, 2021. It is now read-only.

Lessons Learned #36

Open
zfinannee opened this issue Oct 20, 2016 · 12 comments

@zfinannee

zfinannee commented Oct 20, 2016

We should gather Lessons Learned from the process of developing the Prototype. These could include what worked or needed improvement in the Use Case, what we'd do the same or differently in the development process and tools, and anything else that might be helpful to us in the next AGR project.

There is a Lessons Learned Doc in the Working Groups / Portal / Implementation directory for brainstorming ideas, so anyone should feel free to add comments in that doc or in this issue, whichever is easier.

@zfinannee changed the title from "Lesson Learned" to "Lessons Learned" on Oct 20, 2016
@gabinkley

  • the data format needs to be well-defined: use a consistent vocabulary, specify expected data types, and identify required and optional data
  • it was very slow to obtain data from the mines via the API; if we want to use mine data, it would be better to dump the data into a specified file format
  • data from external sources (GO, OMIM, Panther, etc.) should be obtained from the external sites; the MODs should just provide the gene associations (annotation data)
  • would have been helpful to have a "product owner" to coordinate and manage the suggestions/tasks
  • might have been better to develop the prototype interface using Material-UI, rather than Bootstrap
  • the production web site would benefit from a trained designer

@cmpich
Contributor

cmpich commented Nov 2, 2016

Could you specify how long it took to extract the data from each mine, to give us an idea of how bad it is?
Why should we get external sources from external sites rather than the mines?
What is bad about 'Bootstrap'?

@gabinkley

Pedro will need to respond about the mine API timing. Travis can comment on Bootstrap versus Material-UI - but that could be a discussion for the technical call.

With respect to why we should get external data from the external sources - they are the authority on that data. For example getting GO terms, definitions, and synonyms from GO is the right thing to do - not from each individual MOD. Less chance for unintended errors.

@ragingsquirrel3
Contributor

My take on the Bootstrap thing isn't that it's bad, but more that "It would be interesting to see what a Material-UI version would look like." I think using Bootstrap was fine, the main reason being that more people have experience with it.

@jogoodma
Contributor

jogoodma commented Nov 3, 2016

I posted this in slack a while back. Adding it here since it is relevant to this issue.

At FlyBase we have used both material-ui and react-bootstrap on a few recent projects. As a result, I would not recommend material-ui at this point in time due to some serious performance issues that we ran into with several components (mui/material-ui#3289, mui/material-ui#2832). I’m sure they will get worked out with time, but some of it has to do with their use of inline styles, which is fairly core to their library. We’ve looked at a few other material design libraries for react (react-toolbox, etc.) but haven’t gotten our hands dirty with them yet.

@julie-sullivan

@gabinkley @cmpich Can you tell me which mine was slow? In our experience (being in the UK) we have found requests to the MOD mines to be very very fast. There is no reason why the requests should ever be anything less than (almost) instant.

If you let us know what sort of problems you are having, we can troubleshoot and fix.

@pedrohr
Contributor

pedrohr commented Nov 3, 2016

About the mines... I was the one who suggested that lesson learned. The problem is not the mines at all; it's just the time it takes to get the data, and it's a lot of data. Everything is working fine and there's nothing to be fixed. The scripts fetch the data every time we index Elasticsearch, so it takes a long time and makes development a bit hard. But I included a step that saves the Python objects to disk so we can just load them. I know we can also export the results from the mines just once, since the data in the mines won't change very often.

And for production this would not be a problem, since indexing doesn't impact the service (most of the time :)
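The "save the Python objects to disk" step mentioned above could look roughly like the sketch below. This is only an illustration, not the prototype's actual code: the helper name `load_or_fetch` and the cache path are hypothetical, and it assumes the fetched data can be pickled.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def load_or_fetch(cache_path: Path, fetch: Callable[[], Any]) -> Any:
    """Return cached data if the cache file exists; otherwise fetch and cache it.

    `fetch` stands in for whatever slow call retrieves data from a mine
    (hypothetical here; the real scripts are not shown in this thread).
    """
    if cache_path.exists():
        with cache_path.open("rb") as handle:
            return pickle.load(handle)

    data = fetch()
    # Create the cache directory on first use, then store the result.
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as handle:
        pickle.dump(data, handle)
    return data
```

During development the slow fetch then runs only when the cache file is missing; deleting the file forces a fresh download.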

Here are the times it takes to get the data we need:

For genes:

  • mines

    MouseMine: ~13 minutes
    ZFinMine: ~6 minutes
    YeastMine: ~50 seconds

  • dump files

    WormBase: ~11 seconds
    FlyBase: ~0.7 seconds
    RGD: ~8 seconds

For GO:

  • mines

    MouseMine: ~2 minutes
    ZFinMine: ~20 seconds
    YeastMine: ~4 minutes

  • dump files

    WormBase: ~13 seconds
    FlyBase: ~8 seconds
    RGD: ~4 seconds

For diseases:

  • mines

    MouseMine: ~3 seconds
    ZFinMine: ~2 seconds
    YeastMine: ~1 second

  • dump files

    WormBase: < 1 second
    FlyBase: < 1 second
    RGD: < 1 second
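
To put the numbers above side by side, here is a small tally of the approximate timings as listed. It assumes each "< 1 second" entry counts as 1 second (an upper bound), and it is only a rough comparison since the mines and the dump files cover different MODs.

```python
# Approximate timings from the lists above, in seconds.
# Assumption: "< 1 second" is counted as 1.0 (an upper bound).
MINES = {
    "genes": {"MouseMine": 13 * 60, "ZFinMine": 6 * 60, "YeastMine": 50},
    "GO": {"MouseMine": 2 * 60, "ZFinMine": 20, "YeastMine": 4 * 60},
    "diseases": {"MouseMine": 3, "ZFinMine": 2, "YeastMine": 1},
}
DUMPS = {
    "genes": {"WormBase": 11, "FlyBase": 0.7, "RGD": 8},
    "GO": {"WormBase": 13, "FlyBase": 8, "RGD": 4},
    "diseases": {"WormBase": 1.0, "FlyBase": 1.0, "RGD": 1.0},
}


def total_seconds(timings: dict) -> float:
    """Sum every per-source timing across all data types."""
    return sum(sum(per_source.values()) for per_source in timings.values())


mine_total = total_seconds(MINES)
dump_total = total_seconds(DUMPS)
print(f"mines: {mine_total / 60:.1f} min, dump files: {dump_total:.1f} s")
```

With these figures the mines add up to roughly 26 minutes per indexing run, against under 50 seconds for the dump files.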

@julie-sullivan

@pedrohr Thank you!! That's helpful. I see what you mean now. I didn't realise how big those queries were.

Your plan for only running the query on update sounds sensible! You can use web services to find out the version number:

http://iodocs.apps.intermine.org/mgi/docs#/ws-version/GET/version/release
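
A minimal sketch of that idea, re-running the big queries only when the release changes. The `/version/release` path is taken from the docs link above; the service root URL in the usage note, the stamp file, and the function names are all assumptions made for illustration.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Tuple


def fetch_release(service_root: str) -> str:
    """Ask a mine's web service for its release string (plain-text response assumed)."""
    url = service_root.rstrip("/") + "/version/release"
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8").strip()


def release_changed(
    service_root: str,
    stamp_file: Path,
    get_release: Callable[[str], str] = fetch_release,
) -> Tuple[bool, str]:
    """Compare the mine's current release with the one recorded in stamp_file."""
    current = get_release(service_root)
    previous = stamp_file.read_text().strip() if stamp_file.exists() else None
    return current != previous, current


def record_release(stamp_file: Path, release: str) -> None:
    """Remember the release that was last fetched successfully."""
    stamp_file.write_text(release)
```

A caller would check `release_changed("http://www.mousemine.org/mousemine/service", Path("mousemine.release"))` (service root assumed), re-fetch only if the first element is true, and call `record_release` once the fetch succeeds, so that a failed download is retried on the next run.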

@pedrohr
Contributor

pedrohr commented Nov 4, 2016

Nice! Thank you Julie!

@pedrohr
Contributor

pedrohr commented Nov 9, 2016

There's another lesson about the data format that we should think about, particularly the business rules for which fields are required or optional, and what to do when data is incomplete.

I had to make a few decisions on whether or not to discard entries from the mines.

For example, I discarded genes from the mines that had no symbol. But I didn't discard genes that had no chromosome specified, or that had multiple chromosomes. For this prototype, I believe that's OK, but for the future, we have to write down those rules.
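
One way to "write down those rules" is to keep them in code next to the loader. The sketch below encodes only the two decisions described above; the field names `symbol` and `chromosomes` and the function name are hypothetical, not the prototype's actual schema.

```python
from typing import Iterable, List, Tuple


def apply_gene_rules(records: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Split gene records into (kept, discarded) using the prototype's rules.

    Rule 1: a gene without a symbol is discarded.
    Rule 2: a gene with no chromosome, or with several, is kept as-is.
    """
    kept: List[dict] = []
    discarded: List[dict] = []
    for record in records:
        symbol = (record.get("symbol") or "").strip()
        if not symbol:
            discarded.append(record)
            continue
        # Normalise the optional field so downstream code always sees a list.
        normalised = dict(record)
        normalised["chromosomes"] = list(record.get("chromosomes") or [])
        kept.append(normalised)
    return kept, discarded
```

Returning the discarded records as well makes it easy to report how many entries each rule removed, which would help when the rules are reviewed later.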

@zfinannee
Author

I'm copying the comments previous to this one to a document where I'm combining them with the Lessons Learned google doc. Any additional comments will need to be added by hand to the combined doc - https://docs.google.com/document/d/1xsFE534b7e17mjp37cnyC1pbwxcG3O4mbqAumSpphIQ/edit

@selewis
Contributor

selewis commented Nov 17, 2016

Would you tag these with the working groups they affect, please?

On Thu, Nov 17, 2016 at 1:14 PM, Anne Eagle notifications@github.com wrote:

I'm copying the comments previous to this one to a document where I'm combining them with the Lessons Learned google doc. Any additional comments will need to be added by hand to the combined doc - https://docs.google.com/document/d/1xsFE534b7e17mjp37cnyC1pbwxcG3O4mbqAumSpphIQ/edit

