This document records the structure of journals with an Impact Factor greater than 15, taken from 10x Genomics. Only the relevant sections of each journal are included.
- R
- Python
- GitHub
- Zenodo
- European Genome-phenome Archive (EGA)
- Gene Expression Omnibus (GEO)
- Sequence Read Archive (SRA)
- International Nucleotide Sequence Database Collaboration (INSDC)
- European Nucleotide Archive (ENA)
- BioStudies: ArrayExpress (Gene expression data)
- database of Genotypes and Phenotypes (dbGaP)
- ProteomeXchange Consortium - PRIDE: stores data related to protein and peptide identifications
- ProteomeXchange Consortium - PeptideAtlas: stores datasets focusing on peptide identifications and their associated spectral evidence
- ImmPort (Immunology Database and Analysis Portal)
- Zenodo
- China National GeneBank DataBase (CNGBdb)
- BioProject
- Medical Genomics Japan Variant Database (MGeND)
- CellxGene
- National Genomics Data Center (NGDC): Genome Warehouse, Genome Sequence Archive (GSA)
- Neuroscience Multi-omic Archive (NeMO)
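Several of the repositories listed above issue accession numbers with recognisable prefixes, so accessions written as plain text can often be picked out of an availability statement with regular expressions. The following is a minimal sketch, not a complete catalogue: the pattern table covers only a few repositories, the digit-length rules are assumptions based on commonly seen accessions, and `find_accessions` is a hypothetical helper name.

```python
import re

# Illustrative patterns only. Prefixes follow commonly seen accession formats;
# the exact digit counts are assumptions and may need loosening.
ACCESSION_PATTERNS = {
    "GEO": re.compile(r"\bGSE\d+\b"),
    "INSDC reads (SRA/ENA/DDBJ)": re.compile(r"\b[SED]R[PRXS]\d+\b"),
    "BioProject": re.compile(r"\bPRJ(?:NA|EB|DB)\d+\b"),
    "EGA": re.compile(r"\bEGA[SD]\d{11}\b"),
    "dbGaP": re.compile(r"\bphs\d{6}(?:\.v\d+\.p\d+)?\b"),
    "ProteomeXchange": re.compile(r"\bPXD\d{6}\b"),
    "ArrayExpress": re.compile(r"\bE-[A-Z]{4}-\d+\b"),
}


def find_accessions(text):
    """Return {repository: sorted unique accessions} for each pattern that matches."""
    found = {}
    for repo, pattern in ACCESSION_PATTERNS.items():
        hits = sorted(set(pattern.findall(text)))
        if hits:
            found[repo] = hits
    return found


# Made-up accession strings, used only to exercise the patterns.
example = ("Data are deposited in GEO under accession number GSE123456 "
           "and under BioProject PRJNA654321; proteomics data: PXD012345.")
print(find_accessions(example))
```

Repositories without a fixed prefix (for example Zenodo, which uses DOIs) would need a different approach, such as matching URLs.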
Structure 01:
- contains info regarding sequencing data and source code
- if no code is available: “This paper does not report original code” (this statement is mandatory but is not always included)
- code available upon request from authors
- existing software and algorithms used, with no unique code generated: listed in the Key Resources Table
- ‘original code’
- ‘unique code’
- ‘custom code’
- GitHub link for original source code, or for custom code built on existing software
Structure 02:
- contains info regarding sequencing data and source code
Structure 03: (in publications from 04/2011 to 2016)
- mostly written in plain text without a URL
(some publications are not publicly accessible)
- requests for resources and reagents
- section not found in several publications
- sequencing data listed but not source code
- 'original code'
- 'novel code'
- accession numbers of sequencing data & source code URL (GitHub/Zenodo) listed in Key Resources Table
- accession numbers stated in plain text instead of hyperlinks/URLs
- Cancer Cell: 48.8
- Immunity: 25.5
- Cell Host & Microbe: 20.6 (no publication provides source code or accession codes; resources are available upon request)
- Cell Metabolism: 27.7
Structure 01:
- uploaded on database(s) mentioned above
- available upon request
- FASTA files found under Supplementary Information as Supplementary Data (observed in only one publication so far)
- found under Source Data: ‘Fig.’ - .xlsx files
- ‘custom code’
- ‘custom software’
- ‘new codes’
- ‘customized code’
- not provided
- available from authors upon request
- no new code, used existing code
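The wording variants noted above ('custom code', 'custom software', 'new codes', 'customized code', available upon request, no new code) lend themselves to a simple rule-based classifier for code availability statements. The sketch below is one possible way to do this. The category labels, the rule order, and the function name `classify_code_statement` are hypothetical and were introduced for illustration; only the phrases come from the observations in these notes.

```python
import re

# Rules are checked in order, so "no new code" wins over "new code", and
# "upon request" wins over "custom code". Labels are hypothetical.
RULES = [
    ("no_new_code", [r"does not report original code",
                     r"no (?:new|custom|original) code"]),
    ("upon_request", [r"(?:up)?on (?:reasonable )?request"]),
    ("custom_code", [r"custom(?:ized)? (?:code|software)", r"new codes?",
                     r"original code", r"novel code", r"unique code"]),
]


def classify_code_statement(statement):
    """Assign a coarse label to a code availability statement."""
    if not statement or not statement.strip():
        return "not_provided"
    text = statement.lower()
    for label, patterns in RULES:
        if any(re.search(p, text) for p in patterns):
            return label
    return "unclassified"


for s in ["This paper does not report original code.",
          "Custom code is available from the authors upon request.",
          "Customized code is deposited on GitHub.",
          ""]:
    print(repr(s), "->", classify_code_statement(s))
```

Statements that land in "unclassified" would still need manual review, since journals phrase these sections freely.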
Structure 02:
- contains URLs to sequencing data
NOTE: section names by year: before 2019, Accession Codes; in 2019, Data Availability; from 2020 onwards, Data Availability and Code Availability
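When searching articles automatically, the NOTE above can be encoded as a small lookup from publication year to the section headings worth looking for. This is a minimal sketch; the function name `expected_sections` is hypothetical, and the year boundaries simply mirror what was observed in these notes.

```python
def expected_sections(year):
    """Section headings to look for in an article published in `year`.

    Boundaries follow the observations in the NOTE; individual articles
    may deviate.
    """
    if year < 2019:
        return ["Accession Codes"]
    if year == 2019:
        return ["Data Availability"]
    return ["Data Availability", "Code Availability"]


for y in (2016, 2019, 2022):
    print(y, expected_sections(y))
```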
- Nature Biotechnology: 33.1
- Nature Medicine: 58.7
- Nature Genetics: 31.7
- Nature: 50.5
- Nature Cell Biology: 17.3
- Nature Neuroscience: 21.2
- Nature Cancer: 23.5
- Nature Methods: 36.1
- Nature Metabolism: 18.9
- Nature Biomedical Engineering: 26.8
- Nature Microbiology: 20.5
- Cellular & Molecular Immunology: 21.8
- Nature Nanotechnology: 38.1
- contains accession numbers for sequencing data and source code URLs
- accession numbers stated in plain text instead of hyperlinks
- The Lancet Oncology: 41.6
- The Lancet Infectious Diseases: 36.4
- accession numbers given as plain text, as links to sequencing data in References (Zenodo URLs), or as hyperlinks
- source code: given as hyperlinks, sometimes listed in References, or not available at all
- several publications are not freely accessible
- contains accession numbers for sequencing data and source code URLs
- Science Translational Medicine: 16.9
- Science Bulletin: 18.8
Structure 01:
Data Availability/ Data Deposition/ Data and Software Availability/ Data and Materials Availability/ Data Reporting/ Code Available/ Data Archive (2019-present)
- contains accession numbers for sequencing data and source code URLs
- accession numbers stated in plain text instead of hyperlinks/URLs
- very few publications contain source code
- contains accession numbers of publicly available data used by authors
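Because accession numbers are mostly given as plain text rather than hyperlinks, it can help to turn an accession into a landing-page URL when checking it. The sketch below maps a few prefixes to URL templates. The templates are assumptions based on the repositories' public landing pages and should be verified before use; `accession_to_url` is a hypothetical helper, and the accession in the example is made up.

```python
# (prefix, URL template) pairs. Templates are assumptions, not official APIs.
URL_TEMPLATES = [
    ("GSE", "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={acc}"),
    ("PRJNA", "https://www.ncbi.nlm.nih.gov/bioproject/{acc}"),
    ("PRJEB", "https://www.ebi.ac.uk/ena/browser/view/{acc}"),
    ("EGAS", "https://ega-archive.org/studies/{acc}"),
]


def accession_to_url(accession):
    """Return a landing-page URL for a known prefix, or None if unrecognised."""
    acc = accession.strip()
    for prefix, template in URL_TEMPLATES:
        if acc.startswith(prefix):
            return template.format(acc=acc)
    return None


print(accession_to_url("GSE123456"))  # made-up accession
```

Returning `None` for unknown prefixes keeps unrecognised accessions visible for manual follow-up instead of producing a guessed link.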
Structure 02:
- information related to sequencing data
- source code
- contains information regarding sequencing data
- few publications mention source code
- very few mention source code
- accession numbers stated in plain text instead of hyperlinks
- contains accession codes and source code
- contains accession codes (plain text) and source code (2 out of 13 publications)
- accession codes (plain text)
- source code found (1 out of 23 publications)
- available upon request
- contains info regarding sequencing data and source code (1 out of 4 publications)
Structure 01:
- accession number/accession code/project number given in plain text instead of as a hyperlink
- source code available upon request or not available at all; available on GitHub for only 1 out of 23 publications
Structure 02:
- 1 out of 7 publications stated accession number
- data available upon request
- no data available at all
- no source code
- only accession numbers found (2/5 publications) - no source code
- available upon request
- section not found
- accession numbers stated in plain text
- code available upon request or not available at all
- accession numbers in plain text
- data available upon request
- source code not available for any publication
Structure 01:
- accession numbers stated in plain text
- source code URLs stated (1 out of 19 publications)
- available upon request
Structure 02:
- data available upon request
- 1/4 publications stated sequencing data accession number (plain text)
- no source code found
- 3/4 publications contain accession numbers
- data for 1/4 publications is available upon request
- data available upon request
- 1/7 publications shared a source code URL
- 1/7 publications shared a correct accession number together with the name of the database
General Observation: Most publications started adding source code from 2019 onwards
- search results show around 8000 articles, but only 999 articles can be scraped
- search hits stop at page 50; each page contains 20 articles
- faulty sorting: ascending order starts at 1880 and reaches 2019 by page 50; descending order starts at 2024 and reaches only 2023 by page 50
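The cap described above follows from the pagination: with at most 50 pages of 20 hits each, only about 1000 of the roughly 8000 results are reachable per query (the notes record 999). One common workaround is to split the query into publication-year windows small enough to stay under the cap. The sketch below illustrates the idea under assumptions: `count_hits` is a hypothetical callback standing in for whatever returns the hit count for a year range, the per-year counts in the example are made up, and no real site API is implied.

```python
MAX_PAGES = 50
PAGE_SIZE = 20
MAX_REACHABLE = MAX_PAGES * PAGE_SIZE  # 1000 hits per query at most


def split_windows(start_year, end_year, count_hits, cap=MAX_REACHABLE):
    """Recursively halve [start_year, end_year] until each window has <= cap hits.

    A single year that still exceeds the cap is returned as is; it would
    need a finer filter (for example by month) that is not shown here.
    """
    if start_year > end_year:
        return []
    if start_year == end_year or count_hits(start_year, end_year) <= cap:
        return [(start_year, end_year)]
    mid = (start_year + end_year) // 2
    return (split_windows(start_year, mid, count_hits, cap)
            + split_windows(mid + 1, end_year, count_hits, cap))


def fake_count(start, end):
    # Made-up counts for illustration: 400 hits per year.
    return 400 * (end - start + 1)


windows = split_windows(2005, 2024, fake_count)
print(windows)
```

Each resulting window can then be scraped as its own query, which also sidesteps the sorting problem because no window needs more than 50 pages.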