sample_taxa

A treeforall hackathon project. Given a request for a modestly sized tree for a large taxon T, support several useful ways to sample from T: (1) a random sample of species from T, (2) the species in T that have a genome in NCBI genomes, and (3) the top N species in T ranked by the number of occurrence records in iDigBio.

More info

See the Google doc: https://docs.google.com/document/d/1E3QIxEYUu4Q6A3Dc_zJUoxb0O0vEWX88_2ptYW1iMjg

Targets chosen by the group

Choose N species randomly from T.
Choose those species from T that have property A, e.g., a genome in NCBI.
Choose the top N species from T by a relevance metric, e.g., occurrence record counts in iDigBio.
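The three targets above can be sketched in a few lines of Python. This is a minimal illustration, not the code in this repo: the function names, the `has_property` predicate, and the `score` callable are all hypothetical, and the species list is assumed to be already in memory.

```python
import random


def sample_random(species, n, seed=None):
    """Choose up to n species uniformly at random, without replacement."""
    rng = random.Random(seed)  # a seed makes the sample reproducible
    species = list(species)
    return rng.sample(species, min(n, len(species)))


def sample_with_property(species, has_property):
    """Keep only the species for which has_property(name) is true,
    e.g. a lookup against a set of species with an NCBI genome."""
    return [name for name in species if has_property(name)]


def sample_top_n(species, n, score):
    """Return the n species with the highest score(name),
    e.g. an occurrence record count from iDigBio."""
    return sorted(species, key=score, reverse=True)[:n]
```

In use, `has_property` might be `lambda name: name in names_with_genome` and `score` might be `record_counts.get`, where both lookups are built beforehand from the relevant data source.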

Directories in the GitHub repo

arbor

data

Data files. See the README.md in the data directory.

doc

Instructions and documentation.

perl

See the README.md in the perl directory. The directory contains scripts to obtain the induced subtree for the species in a named taxon that have genomes in NCBI.

python

See the README.md in the python directory.

OpenRefine

Implementation of taxon sampling in OpenRefine (the open_refine directory).

random_sample

Python code to sample randomly from a taxon.

tnrs_csv

Utility code (Python) to read a CSV file, invoke the Open Tree of Life (OT) match_names service, and add the resulting matches as a new column in the CSV file.
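The read, match, and write-back steps might look roughly like the sketch below. It is not the code in tnrs_csv: `add_match_column`, `match_names`, and the `matched_name` column are hypothetical names, and the response layout (a `results` list whose entries carry `name` and `matches`, each match holding a `taxon` with a `unique_name`) is an assumption about the v3 match_names service that should be checked against the Open Tree API documentation.

```python
import csv
import json
import urllib.request

# Assumed endpoint for the Open Tree of Life v3 TNRS match_names service.
MATCH_NAMES_URL = "https://api.opentreeoflife.org/v3/tnrs/match_names"


def match_names(names):
    """POST the names to match_names and map each submitted name to the
    first matched taxon name. The response layout is assumed, not verified."""
    payload = json.dumps({"names": list(names)}).encode("utf-8")
    request = urllib.request.Request(
        MATCH_NAMES_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    matched = {}
    for result in body.get("results", []):
        matches = result.get("matches", [])
        if matches:
            matched[result["name"]] = matches[0]["taxon"]["unique_name"]
    return matched


def add_match_column(in_file, out_file, name_column,
                     matcher=match_names, new_column="matched_name"):
    """Copy a CSV from in_file to out_file, appending a column that holds
    the match for each value of name_column (empty when there is no match)."""
    reader = csv.DictReader(in_file)
    rows = list(reader)
    matched = matcher([row[name_column] for row in rows])
    fieldnames = list(reader.fieldnames or []) + [new_column]
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        row[new_column] = matched.get(row[name_column], "")
        writer.writerow(row)
```

Passing the matcher as a parameter keeps the CSV handling testable offline: any callable that maps a list of names to a dict of matches can stand in for the network call.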

arlin/sample_taxa