Add visibilityWindow for WigTrack #1540

Merged: 1 commit, Sep 5, 2022

kpalin
Contributor

@kpalin kpalin commented Sep 1, 2022

Enable (smaller) partial reading of indexed wig-like files. It was probably a programming error to pass windowFunction instead of visibilityWindow to that function.

A different issue: the UI for using tsv+bgzip+tabix for numerical data in an arbitrary column is quite cumbersome. I managed with this track:

{
    name: this_track_name,
    url: fname,
    indexURL: fname + ".tbi",
    order: track_idx,
    type: "wig",
    format: "gwas",
    visibilityWindow: vis_window,
    columns: {
        chromosome: 1,
        position: 2,
        value: value_column
    }
}

Note the combination type="wig", format="gwas". A plain type="tsv" would be nicer.

@jrobinso
Contributor

jrobinso commented Sep 3, 2022

Tabix indexing a wig file is unusual; the normal method is bigWig. Displaying gwas data as a wig track is also unusual. I don't understand the tsv comment: "tsv" is not a track type. I think you are confusing file format with track type.

@jrobinso
Contributor

jrobinso commented Sep 3, 2022

Actually, I don't understand the purpose of this PR; visibilityWindow for wig tracks is already supported. Could you provide a test case where this fails?

@kpalin
Contributor Author

kpalin commented Sep 5, 2022

Maybe I just haven't found the correct format and type settings for my situation. (You're right, I was confusing type and format.)

In general: (1) I want to show numerical data on the y-axis (as a line, bar, or point graph); (2) the values are sparse (e.g. only at CpG sites); (3) the data is in coordinate-sorted, bgzip-compressed, tabix-indexed tsv files; and (4) the files have arbitrary columns after the initial two or three columns of genomic coordinates.

At least for me, this type of data is a very common output of many analysis programs. For example, this issue arose from using the R package DSS for differential methylation calling, which outputs a tsv with columns chr pos mu1 mu2 diff diff.se stat phi1 phi2 pval fdr. I want to display pval as a gwas track, mu1 and mu2 perhaps as line plots, and diff or stat as a bar plot. Importantly, these are intermediate analysis results that need to remain usable for further processing, so bigWig is out of the question in practice.

See below for type=gwas, format=gwas, value_column=10 on the top track and type=wig, format=gwas, value_column=7 on the bottom track. I need format=gwas to use different data columns from the same file (I think).

image
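
For reference, the two tracks described above differ only in type and value column. The following is a minimal sketch, assuming the config shape from the first comment in this thread; makeTrack is a hypothetical helper (not part of igv.js), the file name is a placeholder, and the visibilityWindow value is arbitrary.

```javascript
// Hypothetical helper: builds a track config for one value column of a
// bgzipped, tabix-indexed TSV. Field names follow the config posted earlier
// in this thread; the helper itself is not part of igv.js.
function makeTrack({ name, url, type, valueColumn, visibilityWindow, order }) {
    return {
        name,
        url,
        indexURL: url + ".tbi",
        order,
        type,
        // format selects the parser; per this thread, "gwas" is the one
        // that accepts a custom `columns` definition.
        format: "gwas",
        visibilityWindow,
        columns: { chromosome: 1, position: 2, value: valueColumn }
    };
}

// DSS output columns (1-based):
// chr pos mu1 mu2 diff diff.se stat phi1 phi2 pval fdr
const dssUrl = "dss_results.tsv.gz"; // placeholder file name
const dssTracks = [
    makeTrack({ name: "pval", url: dssUrl, type: "gwas", valueColumn: 10,
                visibilityWindow: 1000000, order: 0 }),
    makeTrack({ name: "stat", url: dssUrl, type: "wig", valueColumn: 7,
                visibilityWindow: 1000000, order: 1 })
];
```

Both entries read the same file; only the track type and the value column change.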

All that being said, the presumed bug fixed by the pull request is quite mundane. Without this change, the browser loads a whole chromosome's worth of data even for a small window; with it, only a visibilityWindow-sized segment of data is loaded.
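
The difference in behaviour can be illustrated with a small sketch. This is not the igv.js code: queryInterval and its parameters are hypothetical, and the centring and clamping policy is an assumption made only to show why passing visibilityWindow shrinks the request.

```javascript
// Illustrative only: NOT the igv.js implementation. Decides which interval
// of a chromosome to request from an indexed file.
function queryInterval(viewStart, viewEnd, chrLength, visibilityWindow) {
    // No usable visibilityWindow: fall back to the whole chromosome.
    if (!(visibilityWindow > 0)) {
        return { start: 0, end: chrLength };
    }
    // Otherwise request a visibilityWindow-sized segment centred on the
    // current view (or the view itself if it is larger), clamped to the
    // chromosome bounds.
    const center = Math.floor((viewStart + viewEnd) / 2);
    const half = Math.ceil(Math.max(visibilityWindow, viewEnd - viewStart) / 2);
    return {
        start: Math.max(0, center - half),
        end: Math.min(chrLength, center + half)
    };
}

// A 10 kb view on a chromosome of length 248956422:
const wholeChr = queryInterval(1000000, 1010000, 248956422, undefined);
const windowed = queryInterval(1000000, 1010000, 248956422, 100000);
```

Under these assumptions, the first call covers the entire chromosome while the second covers only 100 kb around the view.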

@jrobinso
Contributor

jrobinso commented Sep 5, 2022

For a generic line plot or bar graph, "bedgraph" is a reasonable format. "gwas" is a rather specialized format output by "plink" and similar programs. The format parameter is used to select a parser for the file.
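
As a sketch of that suggestion, a bedgraph-backed track might be configured along these lines. The file name is a placeholder, and the graphType value is an assumption to be checked against the igv.js wig-track documentation for the version in use.

```javascript
// Sketch of a wig track backed by a bgzipped, tabix-indexed bedgraph file.
// The URL is a placeholder; bedgraph fixes the layout to
// chrom, start, end, value, so no `columns` mapping is given.
const bedgraphUrl = "values.bedgraph.gz"; // placeholder file name
const bedgraphTrack = {
    name: "values",
    url: bedgraphUrl,
    indexURL: bedgraphUrl + ".tbi",
    type: "wig",          // track type: how the data is drawn
    format: "bedgraph",   // file format: which parser reads the file
    graphType: "bar",     // assumption: verify supported values in the docs
    visibilityWindow: 1000000
};
```

The split between type and format here mirrors the distinction drawn earlier in the thread: type chooses the rendering, format chooses the parser.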

@jrobinso
Contributor

jrobinso commented Sep 5, 2022

Ahh, I see the issue your PR addresses now, this looks good.

@jrobinso jrobinso merged commit b5b261d into igvteam:master Sep 5, 2022
@kpalin
Contributor Author

kpalin commented Sep 6, 2022

Thanks for the merge.

Sorry to use the pull request as a feature request, but my format woes refer to the custom data column definition, which (I think) is only available for the gwas format (found it here). I don't think that's available for bedgraph or any other format.

@jrobinso
Contributor

jrobinso commented Sep 6, 2022

You're correct. The value column in bedgraph and most other formats is already defined, so there is no flexibility (otherwise it's not bedgraph). In some cases, such as the "seg" format, it is always the last column.

If I can reframe the request, it would be support for generic tab-delimited files. The syntax would be very similar to the gwas case; I don't see any way to make it less "clumsy". Suggestions welcome.

@kpalin
Contributor Author

kpalin commented Sep 7, 2022

Now I think we understand each other. I think the gwas format syntax is fine, apart from calling it gwas. Improved features would be: (1) optionally distinct start and end coordinates (currently there is only a single pos); (2) the ability to name the columns for the popup display; and (3) getting the header from the tabix-indexed file (as given by the tabix -H command/option).
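
To make the header-based idea concrete, here is a minimal sketch of mapping column names to 1-based indices. columnIndexByName is a hypothetical helper, not an igv.js function, and it assumes the header is a single tab-delimited line, optionally prefixed with "#".

```javascript
// Hypothetical sketch: turn a header line (such as one printed by
// `tabix -H`) into a map from column name to 1-based column index.
function columnIndexByName(headerLine) {
    const names = headerLine.replace(/^#/, "").trim().split("\t");
    return Object.fromEntries(names.map((name, i) => [name, i + 1]));
}

// Header of the DSS output mentioned earlier in this thread.
const dssHeader = ["chr", "pos", "mu1", "mu2", "diff", "diff.se",
                   "stat", "phi1", "phi2", "pval", "fdr"].join("\t");
const dssCols = columnIndexByName(dssHeader);

// The indices can then fill the gwas-style `columns` definition by name.
const pvalColumns = {
    chromosome: dssCols.chr,
    position: dssCols.pos,
    value: dssCols.pval
};
```

With such a map, the value column could be chosen by name (for example pval) instead of by a hand-counted index.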

@jrobinso
Contributor

jrobinso commented Sep 8, 2022

@kpalin If you want to open an issue for these suggestions, it will lessen the chance they are forgotten. I did not invent the "gwas" format; it's a well-known format by that name, so I think that will remain as-is. The generalized form of this will not rely on that format.
