Request - add some locality attributes to FLAT? #8394

Open
1 task done
dustymc opened this issue Jan 2, 2025 · 5 comments
Labels
Priority-Normal (Not urgent) Normal because this needs to get done but not immediately.

Comments

@dustymc
Contributor

dustymc commented Jan 2, 2025

Help us understand your request (check below):

  • other

Describe what you're trying to do

Make #8393 and #7348 and similar more efficient by caching some locality attribute summary data in FLAT.

I DO want to get the 'normal' stuff; the cost of pulling it dynamically is significant and limiting.

I DO NOT want to add anything to FLAT that won't get significant use; there are long-term costs in scalability, maintenance, and processing to this, and we're always resource limited.

Here's what I think we should add to the cache:

  • Stage/Age ---> stage_age
  • Series/Epoch ---> series_epoch
  • Era/Erathem ---> era_erathem
  • Eon/Eonothem ---> eon_eonothem
  • lithostratigraphic group ---> lithostratigraphic_group
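
The naming pattern in the list above looks mechanical: lowercase, with slashes and spaces turned into underscores. A minimal Python sketch of that mapping follows; `flat_column_name` is a hypothetical helper for illustration, not something from the Arctos codebase.

```python
import re


def flat_column_name(attribute_type: str) -> str:
    """Lowercase the attribute type and turn each run of non-alphanumerics into '_'."""
    return re.sub(r"[^a-z0-9]+", "_", attribute_type.lower()).strip("_")


# Attribute types proposed for the cache in this issue.
proposed = [
    "Stage/Age",
    "Series/Epoch",
    "Era/Erathem",
    "Eon/Eonothem",
    "lithostratigraphic group",
]

for attribute_type in proposed:
    print(f"{attribute_type} ---> {flat_column_name(attribute_type)}")
```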

ES collection folks, HELP!

@keg34
@javanveldhuizen
@mvzhuang
@KatherineLAnderson
@aklompma
@droberts49
@wellerjes
@jrpletch
@Nicole-Ridgwell-NMMNHS
@ehalverson26
@kat-sterner
@jessicatir
@ufarrell
@ronaldeng
@WaigePilson
@Kmullineaux

@Kmullineaux

@dustymc I'm sure all the ones you highlighted would help. I think Era/Erathem would be particularly useful to our collections.

@mkoo
Member

mkoo commented Jan 6, 2025

> I DO NOT want to add anything to FLAT that won't get significant use; there are long-term costs in scalability, maintenance, and processing to this, and we're always resource limited.

With respect to this key point, what are the counts of values for locality attribute types? (https://arctos.database.museum/info/ctDocumentation.cfm?table=ctlocality_attribute_type#attribute_type) That MAY be a way to distinguish between useful stuff for flat. (i.e., Requests for little used attributes will not be considered!) thx

@mkoo mkoo added the Priority-Normal (Not urgent) Normal because this needs to get done but not immediately. label Jan 6, 2025
@dustymc
Contributor Author

dustymc commented Jan 6, 2025

Usage:


        attribute_type        | count  
------------------------------+--------
 locality label               |      2
 lithodemic suite             |      3
 lithostratigraphic bed       |    202
 informal lithostratigraphy   |   1204
 informal chronostratigraphy  |   1936
 biostratigraphic zone        |   3196
 USGS HUC 8-digit             |   6081
 lithostratigraphic group     |   8200
 Eon/Eonothem                 |  10242
 TRS aliquot                  |  13629
 lithostratigraphic member    |  16826
 geology remarks              |  17672
 Stage/Age                    |  19085
 drainage                     |  20812
 biochron                     |  22294
 TRS section                  |  22696
 TRS range                    |  23135
 TRS township                 |  23238
 Era/Erathem                  |  30876
 biota remarks                |  34402
 landholder                   |  34610
 Series/Epoch                 |  39161
 System/Period                |  39444
 feature                      |  43283
 locality access              |  43289
 lithostratigraphic formation |  46958
 site found                   |  55281
 data management history      |  58277
 site identifier              |  83433
 quad                         |  92837
 previous geography           | 138364
 georeference source          | 710804
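
One way to read these counts against the "significant use" criterion is to compare the chronostratigraphic and lithostratigraphic types with a cutoff. A rough Python sketch: the counts are copied from the table above, while the `min_count` value of 10,000 is purely an illustrative assumption, not a threshold anyone in this thread has proposed.

```python
# Counts copied from the usage table (geology-related attribute types only).
usage = {
    "lithostratigraphic group": 8200,
    "Eon/Eonothem": 10242,
    "Stage/Age": 19085,
    "Era/Erathem": 30876,
    "Series/Epoch": 39161,
    "System/Period": 39444,
    "lithostratigraphic formation": 46958,
}


def meets_cutoff(counts, min_count):
    """Return (attribute_type, count) pairs at or above min_count, most used first."""
    kept = [(name, n) for name, n in counts.items() if n >= min_count]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# min_count is a hypothetical cutoff chosen only to demonstrate the filter.
for name, n in meets_cutoff(usage, min_count=10_000):
    print(f"{n:>6}  {name}")
```

With that arbitrary cutoff, lithostratigraphic group (8,200) falls just below the line, which shows how sensitive the outcome is to the number chosen.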

@Nicole-Ridgwell-NMMNHS

Is System/Period already in FLAT?

Are we able to pull from the Chronostrat metadata to fill in higher Chronostrat ranks (for example, record is Albian, FLAT fills in Lower Cretaceous, Cretaceous, Mesozoic)? That would make having these values in FLAT more useful.
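
The fill-in described above amounts to walking up a parent chain from the recorded term. A minimal Python sketch, assuming a hypothetical `PARENT` lookup; the actual structure of the Chronostrat metadata in Arctos is not shown in this thread and may look quite different.

```python
# Hypothetical lookup: term -> (rank, parent term or None at the top).
PARENT = {
    "Albian": ("Stage/Age", "Lower Cretaceous"),
    "Lower Cretaceous": ("Series/Epoch", "Cretaceous"),
    "Cretaceous": ("System/Period", "Mesozoic"),
    "Mesozoic": ("Era/Erathem", "Phanerozoic"),
    "Phanerozoic": ("Eon/Eonothem", None),
}


def fill_higher_ranks(term):
    """Return {rank: term} for the given term and every ancestor above it."""
    filled = {}
    while term is not None:
        rank, parent = PARENT[term]  # raises KeyError for terms not in the lookup
        filled[rank] = term
        term = parent
    return filled


print(fill_higher_ranks("Albian"))
```

A record entered at a higher rank (for example, Cretaceous) would only get the ranks above it filled; the lower ranks stay empty.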

As a collection manager, I function just fine without Chronostrat in my search results because usually I'm using Chronostrat in my query and am more interested in lithostrat in my results BUT I think this would be valuable for other types of paleo users. Particularly System/Period, Series/Epoch, and Stage/Age.

@dustymc
Contributor Author

dustymc commented Jan 7, 2025

> already in FLAT

Don't think so, structure below.

> pull from the Chronostrat metadata

I can write the code, but I think https://github.com/ArctosDB/internal/issues/330 is probably a hard blocker. (Maintaining the cache is already using nearly everything nearly always, and that secondary data is particularly expensive.) That's not a 'no', but I definitely recommend that we proceed with a lot of caution.

> valuable for other types of paleo users

Same reasons as above, I'd recommend not proceeding without a solid use case. (FLAT is really handy for reports - what started this - but not necessary for search, for example.)


                                       Table "cache.flat"
              Column              |            Type             | Collation | Nullable | Default 
----------------------------------+-----------------------------+-----------+----------+---------
 collection_object_id             | integer                     |           | not null | 
 cat_num                          | character varying(40)       |           |          | 
 accn_id                          | integer                     |           | not null | 
 collection_id                    | integer                     |           | not null | 
 institution_acronym              | character varying(20)       |           |          | 
 collection_cde                   | character varying(5)        |           |          | 
 collection                       | character varying(50)       |           |          | 
 collecting_event_id              | integer                     |           |          | 
 verbatim_date                    | character varying(60)       |           |          | 
 last_edit_date                   | timestamp without time zone |           |          | 
 individualcount                  | integer                     |           |          | 
 collectors                       | character varying           |           |          | 
 field_num                        | character varying           |           |          | 
 othercatalognumbers              | character varying           |           |          | 
 genbanknum                       | character varying           |           |          | 
 relatedcatalogeditems            | character varying           |           |          | 
 typestatus                       | character varying           |           |          | 
 sex                              | character varying           |           |          | 
 parts                            | character varying           |           |          | 
 encumbrances                     | character varying           |           |          | 
 accession                        | character varying(81)       |           |          | 
 geog_auth_rec_id                 | integer                     |           |          | 
 higher_geog                      | character varying(255)      |           |          | 
 continent_ocean                  | character varying(50)       |           |          | 
 country                          | character varying(50)       |           |          | 
 state_prov                       | character varying(75)       |           |          | 
 county                           | character varying(50)       |           |          | 
 sea                              | character varying(50)       |           |          | 
 locality_id                      | integer                     |           |          | 
 spec_locality                    | character varying(255)      |           |          | 
 minimum_elevation                | double precision            |           |          | 
 maximum_elevation                | double precision            |           |          | 
 orig_elev_units                  | character varying(30)       |           |          | 
 min_elev_in_m                    | double precision            |           |          | 
 max_elev_in_m                    | double precision            |           |          | 
 dec_lat                          | double precision            |           |          | 
 dec_long                         | double precision            |           |          | 
 datum                            | character varying(55)       |           |          | 
 orig_lat_long_units              | character varying(20)       |           |          | 
 coordinateuncertaintyinmeters    | double precision            |           |          | 
 identification_id                | integer                     |           |          | 
 scientific_name                  | character varying(255)      |           |          | 
 identifiedby                     | character varying           |           |          | 
 remarks                          | character varying           |           |          | 
 habitat                          | character varying           |           |          | 
 associated_species               | character varying           |           |          | 
 taxa_formula                     | character varying(25)       |           |          | 
 full_taxon_name                  | character varying           |           |          | 
 phylclass                        | character varying           |           |          | 
 kingdom                          | character varying           |           |          | 
 phylum                           | character varying           |           |          | 
 phylorder                        | character varying           |           |          | 
 family                           | character varying           |           |          | 
 genus                            | character varying           |           |          | 
 species                          | character varying           |           |          | 
 subspecies                       | character varying           |           |          | 
 author_text                      | character varying           |           |          | 
 nomenclatural_code               | character varying           |           |          | 
 infraspecific_rank               | character varying           |           |          | 
 guid                             | character varying(67)       |           |          | 
 depth_units                      | character varying(20)       |           |          | 
 min_depth                        | double precision            |           |          | 
 max_depth                        | double precision            |           |          | 
 min_depth_in_m                   | double precision            |           |          | 
 max_depth_in_m                   | double precision            |           |          | 
 collecting_method                | character varying           |           |          | 
 collecting_source                | character varying(15)       |           |          | 
 verificationstatus               | character varying(40)       |           |          | 
 imageurl                         | character varying(121)      |           |          | 
 catalognumbertext                | character varying(40)       |           |          | 
 collectornumber                  | character varying           |           |          | 
 verbatimelevation                | character varying(84)       |           |          | 
 year                             | integer                     |           |          | 
 month                            | integer                     |           |          | 
 day                              | integer                     |           |          | 
 stale_flag                       | integer                     |           | not null | 0
 lastuser                         | character varying(38)       |           |          | 
 lastdate                         | timestamp without time zone |           |          | 
 partdetail                       | jsonb                       |           |          | 
 began_date                       | character varying(22)       |           |          | 
 ended_date                       | character varying(22)       |           |          | 
 id_sensu                         | character varying           |           |          | 
 preparators                      | character varying           |           |          | 
 verbatim_locality                | character varying           |           |          | 
 made_date                        | character varying(22)       |           |          | 
 event_assigned_by_agent          | character varying(255)      |           |          | 
 event_assigned_date              | timestamp without time zone |           |          | 
 specimen_event_remark            | character varying           |           |          | 
 specimen_event_type              | character varying(60)       |           |          | 
 coll_event_remarks               | character varying           |           |          | 
 verbatim_coordinates             | character varying(255)      |           |          | 
 collecting_event_name            | character varying(255)      |           |          | 
 georeference_source              | character varying           |           |          | 
 georeference_protocol            | character varying(255)      |           |          | 
 locality_name                    | character varying(255)      |           |          | 
 enteredby                        | character varying(255)      |           |          | 
 entereddate                      | timestamp without time zone |           |          | 
 cataloged_item_type              | character varying(20)       |           |          | 
 previousidentifications          | jsonb                       |           |          | 
 use_license_url                  | character varying           |           |          | 
 identification_remarks           | character varying           |           |          | 
 locality_remarks                 | character varying           |           |          | 
 formatted_scientific_name        | character varying           |           |          | 
 subfamily                        | character varying(255)      |           |          | 
 tribe                            | character varying(255)      |           |          | 
 subtribe                         | character varying(255)      |           |          | 
 has_tissues                      | integer                     |           |          | 
 taxon_rank                       | character varying(255)      |           |          | 
 last_edited_table                | character varying(255)      |           |          | 
 locality_search_terms            | character varying           |           |          | 
 json_locality                    | jsonb                       |           |          | 
 attributedetail                  | jsonb                       |           |          | 
 guid_prefix                      | character varying(50)       |           |          | 
 catalognumberint                 | integer                     |           |          | 
 lastpartlocation                 | character varying           |           |          | 
 organism_id                      | character varying           |           |          | 
 superfamily                      | character varying           |           |          | 
 last_refresh_date                | timestamp without time zone |           |          | 
 suborder                         | character varying           |           |          | 
 media                            | jsonb                       |           |          | 
 creator                          | character varying           |           |          | 
 subject                          | character varying           |           |          | 
 materials                        | character varying           |           |          | 
 copyright_holder                 | character varying           |           |          | 
 culture_of_origin                | character varying           |           |          | 
 culture_of_use                   | character varying           |           |          | 
 formation                        | character varying           |           |          | 
 member                           | character varying           |           |          | 
 event_count                      | integer                     |           |          | 
 locality_count                   | integer                     |           |          | 
 age                              | character varying           |           |          | 
 identifiers                      | jsonb                       |           |          | 
 superorder                       | character varying           |           |          | 
 preparatornumber                 | character varying           |           |          | 
 af                               | character varying           |           |          | 
 nk                               | character varying           |           |          | 
 specimen_event_id                | integer                     |           |          | 
 scientificnameid                 | character varying           |           |          | 
 related_record_cache             | jsonb                       |           |          | 
 detected                         | character varying           |           |          | 
 examined_for                     | character varying           |           |          | 
 not_detected                     | character varying           |           |          | 
 not_examined_for                 | character varying           |           |          | 
 ear_from_notch                   | character varying           |           |          | 
 hind_foot_with_claw              | character varying           |           |          | 
 tail_length                      | character varying           |           |          | 
 total_length                     | character varying           |           |          | 
 weight                           | character varying           |           |          | 
 reproductive_data                | character varying           |           |          | 
 doi                              | character varying           |           |          | 
 collection_preferred_identifiers | character varying           |           |          | 
 public_accn_id                   | integer                     |           |          | 
 rights                           | jsonb                       |           |          | 
 citations                        | jsonb                       |           |          | 
 media_tags                       | jsonb                       |           |          | 
 collector_agents                 | jsonb                       |           |          | 
 kt                               | character varying           |           |          | 
 continent                        | character varying(255)      |           |          | 
 ocean                            | character varying(255)      |           |          | 
