Add Stacked Bar Chart and sorting to Plots Tab #4960

Open
wants to merge 8 commits into
base: master

Conversation

SURAJ-SHARMA27
Contributor

@SURAJ-SHARMA27 SURAJ-SHARMA27 commented Aug 6, 2024

Describe changes proposed in this pull request:
Addition of Sort By MinorCategory Options:

  • Implemented a new feature allowing users to sort by minorCategory.
  • Added a dropdown menu with minorCategory options for sorting.

Sorted Stacks by Selected MinorCategory:

  • For the selected option from the dropdown, entities will appear at the beginning of each stack in a sorted order with respect to the count of chosen minorCategory.
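A minimal sketch of the ordering described above, assuming the data shape discussed later in this thread (each minor category holds a `counts` array of `majorCategory`, `count`, and `percentage`). The interface names and the `orderByMinorCategory` helper are hypothetical and are not taken from the PR's code.

```typescript
// Shapes follow the description in this thread; the interface names are
// illustrative assumptions, not the actual cBioPortal definitions.
interface CategoryCount {
    majorCategory: string; // x-axis label (one stack)
    count: number;
    percentage: number;
}

interface MinorCategoryData {
    minorCategory: string; // one segment within each stack
    counts: CategoryCount[];
}

// Hypothetical helper: x-axis labels ordered by descending count within
// the minor category selected in the "Sort By" dropdown.
function orderByMinorCategory(
    data: MinorCategoryData[],
    selectedMinorCategory: string
): string[] {
    const selected = data.filter(
        d => d.minorCategory === selectedMinorCategory
    )[0];
    if (!selected) {
        return []; // unknown option: the caller keeps its default order
    }
    // Copy first so sort() does not mutate the input array.
    return selected.counts
        .slice()
        .sort((a, b) => b.count - a.count)
        .map(c => c.majorCategory);
}
```

Returning an empty array for an unknown option is one possible design choice; it lets the caller fall back to the existing alphabetical order.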

Any screenshots or GIFs?

screen-capture.9.webm

Notify reviewers

@alisman
@sowmiyaa-kumar
@zeynepkaragoz
@TJMKuijpers
@inodb


netlify bot commented Aug 6, 2024

Deploy Preview for cbioportalfrontend ready!

Name Link
🔨 Latest commit bc6e5b9
🔍 Latest deploy log https://app.netlify.com/sites/cbioportalfrontend/deploys/66bb9d27cde1bf0008050021
😎 Deploy Preview https://deploy-preview-4960--cbioportalfrontend.netlify.app

@inodb inodb requested review from gblaih and alisman August 7, 2024 13:34
@inodb inodb added the gsoc label Aug 7, 2024
@inodb inodb changed the title from "sorting-functionality-added@stackedBarChart at plot tab" to "Add Stacked Bar Chart and sorting functionality" Aug 7, 2024
@inodb inodb added the feature label Aug 7, 2024
@inodb inodb changed the title from "Add Stacked Bar Chart and sorting functionality" to "Add Stacked Bar Chart and sorting to Plots Tab" Aug 7, 2024
@inodb
Member

inodb commented Aug 7, 2024

@SURAJ-SHARMA27 this looks awesome - thank you!

For the sorting, can we also sort by "# samples"? The value of the Y-axis basically

image

This would be similar to e.g. how it works here for treatment response:
image

This is slightly separate, but maybe for treatment response, the sorting should be using your new "Sort By" component too

@schultzn

schultzn commented Aug 7, 2024

Looks great. It would be nice to also be able to sort by the overall number of samples (I see that comment above) and also to change the sort order back to alphabetical.

@jjgao
Member

jjgao commented Aug 7, 2024

Good work @SURAJ-SHARMA27 !

  • Somehow the x-axis labels do not change (they are always alphabetical), although the bars do change

image

  • When switching Plot Type, should we change the sort value, i.e. sort by number of samples if it's "Stacked bar chart" and by % samples if it's "100% stacked bar chart"?

  • Should the 'Sort By' menu be included in the 'Vertical Axis' section, since it uses the values from the vertical axis?

  • It would be nice to add the sort by parameter to the url.

@SURAJ-SHARMA27
Contributor Author

Good work @SURAJ-SHARMA27 !

  • Somehow the x-axis labels do not change (they are always alphabetical), although the bars do change

image

  • When switching Plot Type, should we change the sort value, i.e. sort by number of samples if it's "Stacked bar chart" and by % samples if it's "100% stacked bar chart"?
  • Should the 'Sort By' menu be included in the 'Vertical Axis' section, since it uses the values from the vertical axis?
  • It would be nice to add the sort by parameter to the url.

Actually, I implemented that earlier, as you can see in the video I attached at the time of the PR. The labels were changing, but I forgot to include the logic when pushing it. Thank you for pointing that out. I have now made the changes and pushed it.

@SURAJ-SHARMA27
Contributor Author

I think I have implemented most of the suggestions and added two more sorting options: sort by the number of samples and sort alphabetically. Could you please review it, @inodb?

screen-capture.14.webm

@inodb
Member

inodb commented Aug 8, 2024

Amazing - nice work @SURAJ-SHARMA27 !

Should the 'Sort By' menu be included in the 'Vertical Axis' section, since it uses the values from the vertical axis?

@jjgao Good question - I kinda like it separate from x/y axis selection since we might also want to allow sorting by multiple values in the future (which could be like a multi-select dropdown element)

Few more thoughts:

  • Should we rename # samples to Number of Samples? I like the consistency with the current y-axis and we use # in several places, but it's maybe not as obvious in the sort selection what # means?
  • Can we make Alphabetical show up as the default in the sort by? It currently shows an empty selection on first load:
    image

@@ -58,6 +58,11 @@ export interface IMultipleCategoryBarPlotProps {
svgRef?: (svgContainer: SVGElement | null) => void;
pValue: number | null;
qValue: number | null;
SortByDropDownOptions?: { value: string; label: string }[];
Contributor

sortByDropDownOptions

@@ -425,6 +430,43 @@ export default class MultipleCategoryBarPlot extends React.Component<

@computed get labels() {
if (this.data.length > 0) {
if (this.props.sortByOption == 'SortByTotalSum') {
Contributor

===

);
return sortedMajorCategories;
} else if (
this.props.sortByOption != '' &&
Contributor

lets use === and !== throughout the code

Comment on lines 5436 to 5441
this?.SortByDropDownOptions
}
updateDropDownOptions={
this?.updateDropDownOptions
}
sortByOption={this?.sortByOption}
Contributor

is this?. necessary? i'd think this. is fine

Comment on lines 433 to 436
if (this.props.sortByOption == 'SortByTotalSum') {
const majorCategoryCounts: any = {};

this.data.forEach(item => {
Contributor

this code looks pretty similar to the function sortDataByOption you made below. it would be nice to extract this into a function that can be reused

(a, b) => majorCategoryCounts[b] - majorCategoryCounts[a]
);

const reorderCounts = (counts: any) => {
Contributor

instead of making this function inside, lets extract this into a new function outside for clarity and reuse
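One way the extraction suggested here might look. This is a sketch under assumptions: the body of the original inner `reorderCounts` is not shown in this thread, so the behavior below (arranging one stack's counts to follow an already-sorted list of major categories) is inferred from its name and call site, and the `CategoryCount` interface is illustrative.

```typescript
// Illustrative shape; the real type in the codebase may differ.
interface CategoryCount {
    majorCategory: string;
    count: number;
    percentage: number;
}

// Hypothetical module-level version of the inner helper: arrange counts
// so they follow the order of `sortedMajorCategories`.
function reorderCounts(
    counts: CategoryCount[],
    sortedMajorCategories: string[]
): CategoryCount[] {
    // Categories missing from the sorted list are ranked last.
    const last = sortedMajorCategories.length;
    const rank = (c: CategoryCount): number => {
        const index = sortedMajorCategories.indexOf(c.majorCategory);
        return index === -1 ? last : index;
    };
    // slice() copies the array so the caller's data is left untouched.
    return counts.slice().sort((a, b) => rank(a) - rank(b));
}
```

Because the function takes everything it needs as parameters, it can be shared between the data-sorting code and the label computation and tested in isolation.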

@SURAJ-SHARMA27
Contributor Author

I have implemented all the changes as suggested. You can review them. I am also attaching a video for reference.
@inodb @gblaih

screen-capture.15.webm

@inodb
Member

inodb commented Aug 13, 2024

@SURAJ-SHARMA27 Thanks for the fixes! I noticed the "Sort by" option is currently missing here, do you see that?

image

@@ -435,6 +451,18 @@ export default class MultipleCategoryBarPlot extends React.Component<
}
}

private setInitialSelectedOption = () => {
Collaborator

this is a bit of an anti-pattern: the child component calls up to update the state of the parent component on render. shouldn't it be possible to do this in the parent component prior to mount?

Contributor Author

In the parent component, it doesn't contain data in the form of IMultipleCategoryBarPlotData, from which I have to extract all the unique major categories for dropdown options. Therefore, to avoid extra calculations, I passed the prop from the parent and updated it when it became available in the child.

@@ -98,6 +98,71 @@ export function sortDataByCategory<D>(
}
});
}
export function getSortedMajorCategories(
Collaborator

can we have a comment explaining what is meant by "Major Category" ?

Contributor Author

image

The data is defined in a way that it contains x-axis labels referred to as major categories. This functionality is used twice: once during the sorting of data and again when the getLabels function in MultipleCategoryBarPlot.tsx is called. Therefore, I created a separate function for that to enhance reusability.

sortByOption: string | undefined
): string[] {
if (sortByOption === 'SortByTotalSum') {
const majorCategoryCounts: any = {};
Collaborator

avoid 'any'. lets type this.

Contributor Author

Yes, I have defined the types across all the changes.

if (sortByOption === 'SortByTotalSum') {
const majorCategoryCounts: any = {};

data.forEach(item => {
Collaborator

i think you can do this operation with flatMap, which will avoid nesting forEach (which is confusing)

Contributor Author

The above function is used to find the sum of each majorCategory for different minorCategories. For instance, consider the following example:

data contains three minorCategory names: Biopsy, Resection, and Cusa. Each of these minorCategories contains a large array named counts, and each element in that array is an object containing majorCategory, count, and percentage. For example, for Biopsy, the majorCategories might include Germ Cell Tumor and Thyroid Cancer....

I am maintaining a totalSum of these majorCategories and then applying sorting.

If I use flatMap, it would first be difficult to determine which majorCategory belongs to which minorCategory. Additionally, I think it would require more space and have the same time complexity of O(n * m) as the current two nested loops. I would first need to perform the flatMap operation (a map followed by a flatten, essentially equivalent to a nested loop) and then use reduce to accumulate the counts. What do you suggest? Should I follow another approach?
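For comparison, a sketch of what the flatMap variant could look like, assuming the data shape described in this reply. The interface and function names are hypothetical. Since the total per majorCategory is summed across all minor categories, the flattened entries do not need to remember which minorCategory they came from. The work is still O(n * m), so the difference is readability rather than speed. Note that `Array.prototype.flatMap` requires an ES2019 (or later) `lib` setting in TypeScript.

```typescript
// Illustrative shapes based on the description in this thread.
interface CategoryCount {
    majorCategory: string;
    count: number;
    percentage: number;
}

interface MinorCategoryData {
    minorCategory: string;
    counts: CategoryCount[];
}

// Sum the counts of each major category across all minor categories.
function totalCountByMajorCategory(
    data: MinorCategoryData[]
): Record<string, number> {
    return data
        .flatMap(d => d.counts) // one flat list of all count entries
        .reduce<Record<string, number>>((totals, entry) => {
            totals[entry.majorCategory] =
                (totals[entry.majorCategory] || 0) + entry.count;
            return totals;
        }, {});
}

// Major categories ordered by descending total count.
function sortMajorCategoriesByTotal(data: MinorCategoryData[]): string[] {
    const totals = totalCountByMajorCategory(data);
    return Object.keys(totals).sort((a, b) => totals[b] - totals[a]);
}
```

Typing the accumulator as `Record<string, number>` also addresses the earlier request to avoid `any` for `majorCategoryCounts`.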


return sortedEntityData.counts.map(item => item.majorCategory);
}
}
Collaborator

just complete the else here.

Contributor Author

yes completed

data: IMultipleCategoryBarPlotData[],
sortByOption: string | undefined
): string[] {
if (sortByOption === 'SortByTotalSum') {
Collaborator

see if SortByTotalSum is available on an enum somewhere. seems likely that it is

Contributor Author

No, it was not there; I have defined an enumeration and used it

return Object.keys(majorCategoryCounts).sort(
(a, b) => majorCategoryCounts[b] - majorCategoryCounts[a]
);
} else if (sortByOption !== '' && sortByOption !== 'alphabetically') {
Collaborator

use constant for 'alphabetically' or see if it's available on an enum

Contributor Author

I have defined an enumeration and used it
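A sketch of what such an enumeration might look like. The string values `'SortByTotalSum'` and `'alphabetically'` come from the diff above; the enum name, member names, and the `isMinorCategorySort` helper are assumptions made for illustration and may not match what was actually committed.

```typescript
// Hypothetical enum wrapping the two literal values seen in the diff.
enum SortByOption {
    TotalSum = 'SortByTotalSum',
    Alphabetical = 'alphabetically',
}

// Hypothetical helper mirroring the `else if` condition in the diff:
// any non-empty option that is not one of the two fixed values is
// treated as the name of a minor category to sort by.
function isMinorCategorySort(sortByOption: string | undefined): boolean {
    return (
        sortByOption !== undefined &&
        sortByOption !== '' &&
        sortByOption !== SortByOption.TotalSum &&
        sortByOption !== SortByOption.Alphabetical
    );
}
```

Keeping the literals in one place means a typo in a comparison becomes a compile error instead of a silently skipped branch.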

const sortedMajorCategories = getSortedMajorCategories(data, sortByOption);

if (sortByOption === 'SortByTotalSum' || sortedMajorCategories.length > 0) {
const reorderCounts = (counts: any) => {
Collaborator

avoid 'any'

Contributor Author

Yes, I have defined the types across all the changes.

@inodb
Member

inodb commented Aug 13, 2024

@SURAJ-SHARMA27 Just discussed this during the community call, people liked it! Some more feedback from the group:

  • When switching to percentage, the bars should sort by the percentage value (rather than absolute value)
  • On the Table: maybe disable the sort here? Or if it is easy to enable sorting there too, enable it.
  • Remove "Sort By" on charts where it is not functional (e.g. continuous vs continuous). Currently, the "Sort by" seems to show old values from a previous selection

@SURAJ-SHARMA27
Contributor Author

@SURAJ-SHARMA27 Thanks for the fixes! I noticed the "Sort by" option is currently missing here, do you see that?

image

Yes, I fixed it.

@SURAJ-SHARMA27
Contributor Author

@SURAJ-SHARMA27 Just discussed this during the community call, people liked it! Some more feedback from the group:

  • When switching to percentage, the bars should sort by the percentage value (rather than absolute value)
  • On the Table: maybe disable the sort here? Or if it is easy to enable sorting there too, enable it.
  • Remove "Sort By" on charts where it is not functional (e.g. continuous vs continuous). Currently, the "Sort by" seems to show old values from a previous selection

Thank you @inodb sir, just a few more questions:
i) This means that when the number of samples is selected from the dropdown, the data should be sorted by the percentage value.
ii) Sorting will be shown only for the stacked bar chart; I have fixed that.

@inodb
Member

inodb commented Aug 21, 2024

i) This means that when the number of samples is selected from the dropdown, the data should be sorted by the percentage value.

@SURAJ-SHARMA27 correct

ii) Sorting will be shown only for the stacked bar chart; I have fixed that.

Thanks so much for fixing!

@SURAJ-SHARMA27
Contributor Author

i) This means that when the number of samples is selected from the dropdown, the data should be sorted by the percentage value.

@SURAJ-SHARMA27 correct

ii) Sorting will be shown only for the stacked bar chart; I have fixed that.

Thanks so much for fixing!

okay sir, I think I did it correctly. You can review it (pushed already). The bars will be sorted according to the percentages of the selected minor category. If the number of samples is selected from the dropdown, the bars are sorted by the total absolute count for each stack. For alphabetical sorting, the bars will be sorted as usual.
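The difference between sorting a selected minor category by absolute count and by percentage can be sketched as follows. This is an illustration only: the `usePercentage` flag (standing in for the "100% stacked bar chart" plot type), the interface, and the function name are assumptions and not the PR's actual code.

```typescript
// Illustrative shape based on the description in this thread.
interface CategoryCount {
    majorCategory: string;
    count: number;
    percentage: number;
}

// Order the stacks by the selected minor category's values, using the
// percentage when the plot is a 100% stacked bar chart and the absolute
// count otherwise. `usePercentage` is a hypothetical flag.
function orderStacks(
    selectedCounts: CategoryCount[],
    usePercentage: boolean
): string[] {
    const value = (c: CategoryCount): number =>
        usePercentage ? c.percentage : c.count;
    return selectedCounts
        .slice() // avoid mutating the caller's array
        .sort((a, b) => value(b) - value(a))
        .map(c => c.majorCategory);
}

// A stack can lead by count but not by percentage, so the two orders differ.
const sampleCounts: CategoryCount[] = [
    { majorCategory: 'Germ Cell Tumor', count: 10, percentage: 20 },
    { majorCategory: 'Thyroid Cancer', count: 5, percentage: 50 },
];
console.log(orderStacks(sampleCounts, false)); // count order
console.log(orderStacks(sampleCounts, true)); // percentage order
```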

screen-capture.33.webm

6 participants