Lessons Learned #36

zfinannee · 2016-10-20T18:05:31Z

We should gather Lessons Learned from the process of developing the Prototype. Could include things such as what worked / needed improvement about the Use Case, what we'd do the same or differently about the development process and tools, etc etc etc - whatever might be helpful to us in the next AGR project.

There is a Lessons Learned Doc in the Working Groups / Portal / Implementation directory for brainstorming ideas, so anyone should feel free to add comments in that doc or in this issue - whatever is easier.

gabinkley · 2016-11-02T22:45:12Z

data format needs to be well-defined: consistent vocabulary, specify expected data types, identify required and optional data
it was very slow to obtain data from the mines via the API; if we want to use mine data it would be better to dump the data into a specified file format
data from external sources (GO, OMIM, Panther, etc.) should be obtained from the external sites, the MODs should just provide the gene associations (annotation data)
would have been helpful to have a "product owner" to coordinate and manage the suggestions/tasks
might have been better to develop the prototype interface using Material-UI, rather than Bootstrap
the production web site would benefit from a trained designer

cmpich · 2016-11-02T22:54:58Z

Could you specify how long it took to extract the data from each mine? To get an idea how bad it is?
Why should we get external sources from external sites rather than the mines?
What is bad about 'Bootstrap'?

gabinkley · 2016-11-02T22:59:43Z

Pedro will need to respond about the mine API timing. Travis can comment on Bootstrap versus Material-UI - but that could be a discussion for the technical call.

With respect to why we should get external data from the external sources - they are the authority on that data. For example getting GO terms, definitions, and synonyms from GO is the right thing to do - not from each individual MOD. Less chance for unintended errors.

ragingsquirrel3 · 2016-11-02T22:59:52Z

My take on the bootstrap thing isn't that it's bad, but more that "It would be interesting to see what a material UI version would look like." I think using bootstrap was fine, the main reason being that more people have experience with it.

jogoodma · 2016-11-03T01:31:45Z

I posted this in slack a while back. Adding it here since it is relevant to this issue.

At FlyBase we have used both material-ui and react-bootstrap on a few recent projects. As a result, I would not recommend material-ui at this point in time due to some serious performance issues that we ran into with several components (mui/material-ui#3289, mui/material-ui#2832). I’m sure they will get worked out with time, but some of it has to do with their use of inline styles, which is fairly core to their library. We’ve looked at a few other material design libraries for react (react-toolbox, etc.) but haven’t gotten our hands dirty with them yet.

julie-sullivan · 2016-11-03T09:46:56Z

@gabinkley @cmpich Can you tell me which mine was slow? In our experience (being in the UK) we have found requests to the MOD mines to be very very fast. There is no reason why the requests should ever be anything less than (almost) instant.

If you let us know what sort of problems you are having, we can troubleshoot and fix.

pedrohr · 2016-11-03T18:46:23Z

About the mines... I was the one who suggested that "Lesson learned". The problem is not the mines at all; it's just that there is a lot of data and it takes time to fetch. Everything is working fine and there's nothing to be fixed. The scripts fetch the data every time we index Elasticsearch, so it takes a long time and makes development a bit hard. But I included a step that saves the Python objects to disk so we can just load them. I know we can also export the results from the mines just once, since the data in the mines won't change very often.

And for production this would not be a problem, since indexing doesn't impact the service (most of the time :)
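The "save the Python objects to disk" step described above could look roughly like the sketch below. It is not the prototype's actual code: the `load_or_fetch` helper, the cache path, and the `fetch` callable are all hypothetical names, and `pickle` is only one way to store the objects.

```python
import pickle
from pathlib import Path


def load_or_fetch(cache_path, fetch):
    """Return the cached objects if cache_path exists; otherwise call
    fetch(), save the result to cache_path, and return it.

    `fetch` is a hypothetical zero-argument callable that wraps the slow
    mine query (for example, the ~13 minute MouseMine gene query).
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Fast path: reuse the objects saved by an earlier run.
        with cache_path.open("rb") as handle:
            return pickle.load(handle)

    # Slow path: query the mine once, then save the result for next time.
    data = fetch()
    with cache_path.open("wb") as handle:
        pickle.dump(data, handle)
    return data
```

During development the indexing script would call something like `load_or_fetch("mousemine_genes.pickle", fetch_genes)` and only pay the query cost on the first run. Deleting the file forces a fresh fetch.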

Here are the times it takes to get the data we need:

For genes:

mines:
MouseMine: ~13 minutes
ZFinMine: ~6 minutes
YeastMine: ~50 seconds

dump files:
WormBase: ~11 seconds
FlyBase: ~0.7 seconds
RGD: ~8 seconds

For GO:

mines:
MouseMine: ~2 minutes
ZFinMine: ~20 seconds
YeastMine: ~4 minutes

dump files:
WormBase: ~13 seconds
FlyBase: ~8 seconds
RGD: ~4 seconds

For diseases:

mines:
MouseMine: ~3 seconds
ZFinMine: ~2 seconds
YeastMine: ~1 second

dump files:
WormBase: < 1 second
FlyBase: < 1 second
RGD: < 1 second

julie-sullivan · 2016-11-04T16:18:49Z

@pedrohr Thank you!! That's helpful. I see what you mean now. I didn't realise how big those queries were.

Your plan for only running the query on update sounds sensible! You can use web services to find out the version number:

http://iodocs.apps.intermine.org/mgi/docs#/ws-version/GET/version/release
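A rough sketch of "only run the query on update", assuming the release endpoint linked above returns a plain release string. The service root URL, the state file, and all function names here are assumptions for illustration, so check the mine's own web service documentation before relying on them.

```python
import json
import urllib.request
from pathlib import Path

# Assumed service root for MouseMine; other mines have their own roots.
MOUSEMINE_SERVICE = "http://www.mousemine.org/mousemine/service"


def fetch_release(service_root=MOUSEMINE_SERVICE):
    """Ask the mine's web service for its release string (needs network)."""
    with urllib.request.urlopen(service_root + "/version/release") as response:
        return response.read().decode("utf-8").strip()


def needs_refresh(state_file, current_release):
    """Return True if no release was recorded yet or it differs from
    current_release, meaning the cached data should be fetched again."""
    state_file = Path(state_file)
    if not state_file.exists():
        return True
    recorded = json.loads(state_file.read_text())["release"]
    return recorded != current_release


def record_release(state_file, release):
    """Remember which release the cached data came from."""
    Path(state_file).write_text(json.dumps({"release": release}))
```

The indexing script would call `fetch_release()`, re-run the big queries only when `needs_refresh(...)` is true, and then call `record_release(...)` once the new data is saved.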

pedrohr · 2016-11-04T18:38:45Z

Nice! Thank you Julie!

pedrohr · 2016-11-09T21:02:40Z

Another lesson: we should think about the data format, particularly the business rules for which fields are required or optional, and what to do when data is incomplete.

I had to make a few decisions on whether or not to discard entries from the mines.

For example, I discarded genes from the mines that had no symbol. But I didn't discard genes that had no chromosome specified, or that had more than one. For this prototype I believe that's OK, but for the future we have to write down those rules.
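One way to write those rules down is as data rather than as decisions buried in the loading scripts. The sketch below encodes only the two rules mentioned above (a symbol is required; chromosome may be missing or multiple). The field names and the sample records are invented for illustration, not taken from the prototype.

```python
# Hypothetical field names; the real schema would come from the agreed data format.
REQUIRED_FIELDS = ("symbol",)       # entries missing any of these are discarded
OPTIONAL_FIELDS = ("chromosomes",)  # may be empty or hold several values


def apply_rules(genes):
    """Split raw gene records into (kept, discarded) lists.

    A record is kept only if every required field is present and non-empty.
    Optional fields are never a reason to discard a record.
    """
    kept, discarded = [], []
    for gene in genes:
        if all(gene.get(field) for field in REQUIRED_FIELDS):
            kept.append(gene)
        else:
            discarded.append(gene)
    return kept, discarded


# Invented sample records covering the cases from the comment above.
raw = [
    {"symbol": "geneA", "chromosomes": ["2"]},
    {"symbol": "", "chromosomes": ["X"]},            # no symbol: discarded
    {"symbol": "geneB", "chromosomes": []},          # no chromosome: kept
    {"symbol": "geneC", "chromosomes": ["1", "4"]},  # multiple chromosomes: kept
]
kept, discarded = apply_rules(raw)
print([g["symbol"] for g in kept])  # prints ['geneA', 'geneB', 'geneC']
```

Returning the discarded records instead of dropping them silently makes it easy to report back to each MOD what was left out and why.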

zfinannee · 2016-11-17T21:14:17Z

I'm copying the comments previous to this one to a document where I'm combining them with the Lessons Learned google doc. Any additional comments will need to be added by hand to the combined doc - https://docs.google.com/document/d/1xsFE534b7e17mjp37cnyC1pbwxcG3O4mbqAumSpphIQ/edit

selewis · 2016-11-17T22:08:59Z

Would you please tag these with the working groups they affect?


zfinannee added the documentation label Oct 20, 2016

zfinannee changed the title ~~Lesson Learned~~ Lessons Learned Oct 20, 2016

gabinkley mentioned this issue Nov 10, 2016

Search enhancements #39

Closed

cmdcolin mentioned this issue Apr 7, 2021

Virtualized tree for tracklist to support having thousands of tracks GMOD/jbrowse-components#1867

Merged
