Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to recover gene expression used for scran::pairwiseTTests() #16

Open
ewijaya opened this issue Mar 8, 2022 · 1 comment
Open

How to recover gene expression used for scran::pairwiseTTests() #16

ewijaya opened this issue Mar 8, 2022 · 1 comment

Comments

@ewijaya
Copy link

ewijaya commented Mar 8, 2022

Hi Aaron,

Thank you for making this wonderful tool.

This is not an issue but a question for scran::pairwiseTTests().
And I also know this is a scuttle repository, but I can't find one match for scran.
So I posted here.

I'm wondering if there's a way we can recover the mean gene expression
used for calculating the fold change of scran::pairwiseTTests.

I tried the following code used for calculating fold change using scran::pairwiseTTests:

library(scuttle)
library(scran)
library(tidyverse)
sce <- mockSCE()
sce <- logNormCounts(sce)


cell_groupings <- colData(sce)$Mutation_Status 
names(cell_groupings) <- rownames(sce)

out <- pairwiseTTests(scuttle::logNormCounts(sce), groups=cell_groupings)
out$statistics[[1]] %>% # negative / positive
  as.data.frame() %>%
  rownames_to_column(var = "genes") %>%
  as_tibble() %>%
  arrange(desc(logFC)) 


It produces

genes     logFC  p.value   FDR
 <chr>     <dbl>    <dbl> <dbl>
1 Gene_1740 1.20  0.000617 0.733
2 Gene_0691 0.968 0.00435  0.733
3 Gene_1755 0.932 0.00690  0.733
4 Gene_1460 0.916 0.0107   0.733
5 Gene_0894 0.890 0.0129   0.733
6 Gene_1910 0.847 0.0208   0.761
7 Gene_1839 0.844 0.0168   0.733
8 Gene_1147 0.839 0.00304  0.733
9 Gene_0477 0.832 0.0134   0.733
10 Gene_1201 0.816 0.00643  0.733

Notice that the logFC for Gene_1740 is 1.20 and Gene_0691 is 0.968.

But when I compute manually to get mean expression:

meta_data_df <- colData(sce) %>% 
  as.matrix() %>% 
  as.data.frame()   %>% 
  rownames_to_column(var = "barcodes") %>% 
  as_tibble()

scuttle::normalizeCounts(sce) %>% 
as.matrix() %>% 
as.data.frame() %>% 
rownames_to_column(var = "genes") %>% 
as_tibble() %>% 
pivot_longer(-genes, names_to = "barcodes", values_to = "gexp") %>% 
  left_join(meta_data_df, by = "barcodes") %>% 
  filter(genes %in% c("Gene_1740", "Gene_0691")) %>%
  group_by(genes, Mutation_Status) %>% 
  summarise(mean_gexp = mean(gexp)) %>% 
  pivot_wider(names_from = "Mutation_Status", values_from = "mean_gexp") %>% 
  mutate(logFC = log2(negative/positive)) %>% 
  arrange(desc(logFC))

I get

genes     negative positive logFC
<chr>        <dbl>    <dbl> <dbl>
1 Gene_1740     4.74     3.54 0.421
2 Gene_0691     4.28     3.32 0.370


Notice the difference in logFC. Is there a way I can easily recover the actual expression so that the logFC match
calculated with scran::pairwiseTTests?

Thanks and hope to hear from you again.

Edward

@LTLA
Copy link
Owner

LTLA commented Mar 9, 2022

I don't follow any of the pipes, but I will say that normalizeCounts will produce, by default, log2-normalized values. So your log-fold change calculation should not be log2(negative/positive), but instead, negative - positive, assuming that both negative and positive represent the mean log-expression value in their respective groups.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants