Getting Cluster labels #14

Open
germanb27 opened this issue Nov 4, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@germanb27

Description of feature

I was trying to get the cluster assigned to each cell, but I think this feature is either not available or not easily accessible (I tried with BaseClusterEstimator but could not get it to work). Could this be implemented so that I can add a column with the cluster to my dataframe?

@germanb27 germanb27 added the enhancement New feature or request label Nov 4, 2024
@berombau
Member

Hi,

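A complete example with the FlowSOM object would look something like this. The per-cell labels are in .obs of the cell data: the metacluster labels are in the column metaclustering, and the SOM node each cell maps to is in the column clustering: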

In [1]: # Import the FlowSOM package
   ...: import flowsom as fs
   ...: 
   ...: # Load the FCS file
   ...: ff = fs.io.read_FCS("./tests/data/ff.fcs")
   ...: 
   ...: # Run the FlowSOM algorithm
   ...: fsom = fs.FlowSOM(
   ...:     ff, cols_to_use=[8, 11, 13, 14, 15, 16, 17], xdim=10, ydim=10, n_clusters=10, seed=42
   ...: )

In [8]: fsom.get_cell_data().obs
Out[8]: 
       clustering  distance_to_bmu  metaclustering
0              91         0.511422               0
1              95         1.114001               2
2              16         1.093650               1
3              55         0.953404               7
4              49         0.688398               3
...           ...              ...             ...
19220          46         0.675527               3
19221          67         0.882528               3
19222          11         0.476845               0
19223           0         0.573982               0
19224          61         0.593200               0

[19225 rows x 3 columns]
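
To get these labels into an existing dataframe, a minimal sketch could look like the following. It assumes the dataframe has one row per cell, in the same order as the cells in the FCS file, so the labels can be copied by position. The helper `add_cluster_labels`, the `cells` dataframe and its `cell_id` column are hypothetical names used for illustration, not part of flowsom. The small `obs` table stands in for `fsom.get_cell_data().obs` and holds the first five rows printed above.

```python
import pandas as pd


def add_cluster_labels(df, obs, label_col="metaclustering", out_col="cluster"):
    """Return a copy of `df` with the labels from `obs[label_col]` as a new column.

    Hypothetical helper, not part of flowsom. Labels are copied by position,
    so row i of `df` must describe the same cell as row i of `obs`.
    """
    if len(df) != len(obs):
        raise ValueError(f"row count mismatch: {len(df)} in df, {len(obs)} in obs")
    out = df.copy()
    # to_numpy() drops the obs index, so the assignment is positional
    out[out_col] = obs[label_col].to_numpy()
    return out


# Stand-in for fsom.get_cell_data().obs: the first five rows printed above
obs = pd.DataFrame(
    {
        "clustering": [91, 95, 16, 55, 49],
        "metaclustering": [0, 2, 1, 7, 3],
    }
)
# Hypothetical user dataframe with one row per cell, in the same order
cells = pd.DataFrame({"cell_id": ["a", "b", "c", "d", "e"]})

labelled = add_cluster_labels(cells, obs)
print(labelled)
```

With a real FlowSOM object, the call would be `add_cluster_labels(df, fsom.get_cell_data().obs)`. Passing `label_col="clustering"` copies the SOM node instead of the metacluster.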

For the functional AnnData workflow with flowsom_clustering, it looks something like this. The labels are in the column FlowSOM_metaclusters.

In [10]: ff_clustered = fs.flowsom_clustering(ff, cols_to_use=[8, 11, 13, 14, 15, 16, 17], xdim=10, ydim=10, n_clusters=10, seed=42)
    ...: ff_clustered
2024-11-25 11:30:44.034 | DEBUG    | flowsom.main:__init__:84 - Reading input.
2024-11-25 11:30:44.035 | DEBUG    | flowsom.main:__init__:86 - Fitting model: clustering and metaclustering.
2024-11-25 11:30:44.194 | DEBUG    | flowsom.main:__init__:88 - Updating derived values.
Out[10]: 
AnnData object with n_obs × n_vars = 19225 × 18
    obs: 'clustering', 'distance_to_bmu', 'metaclustering', 'FlowSOM_clusters', 'FlowSOM_metaclusters'
    var: 'n', 'channel', 'marker', '$PnB', '$PnE', '$PnG', '$PnR', '$PnV', 'pretty_colnames', 'markers', 'channels', 'cols_used'
    uns: 'meta', 'n_nodes', 'n_metaclusters', 'FlowSOM'

In [11]: ff_clustered.obs
Out[11]: 
       clustering  distance_to_bmu  metaclustering  FlowSOM_clusters  FlowSOM_metaclusters
0              91         0.511422               0                91                     0
1              95         1.114001               2                95                     2
2              16         1.093650               1                16                     1
3              55         0.953404               7                55                     7
4              49         0.688398               3                49                     3
...           ...              ...             ...               ...                   ...
19220          46         0.675527               3                46                     3
19221          67         0.882528               3                67                     3
19222          11         0.476845               0                11                     0
19223           0         0.573982               0                 0                     0
19224          61         0.593200               0                61                     0

[19225 rows x 5 columns]
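
If a single table with the measured values and the labels side by side is more convenient, something like the sketch below may work. It relies on `AnnData.to_df()`, which returns the data matrix as a pandas DataFrame with one row per cell. The helper name `cells_with_labels` is hypothetical and not part of flowsom, and the default column names are the ones shown in the output above.

```python
import pandas as pd


def cells_with_labels(adata, label_cols=("FlowSOM_clusters", "FlowSOM_metaclusters")):
    """Return the data matrix as a DataFrame with the label columns appended.

    Hypothetical helper, not part of flowsom. `adata` is expected to behave
    like an AnnData object: `to_df()` gives cells x channels, `obs` holds the labels.
    """
    table = adata.to_df()
    for col in label_cols:
        # obs and to_df() describe the same cells in the same order,
        # so the labels are copied by position
        table[col] = adata.obs[col].to_numpy()
    return table
```

Called as `cells_with_labels(ff_clustered)`, this should give one row per cell with the channel columns followed by the two label columns. The result can then be filtered or exported like any other pandas DataFrame.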

Should the .obs functionality be unclear, see the AnnData documentation.
