Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Longest Common Subsequence #315

Merged
merged 77 commits into from
Jan 19, 2021
Merged

Adding Longest Common Subsequence #315

merged 77 commits into from
Jan 19, 2021

Conversation

Arvind-raj06
Copy link
Member

@Arvind-raj06 Arvind-raj06 commented Jan 16, 2021

Longest Common Subsequence

References to other Issues or PRs or Relevant literature

"Fixes #308". See
#308

Brief description of what is fixed or changed

The Longest Common Subsequence algorithm is added in the linear data structure with reference to the site https://www.geeksforgeeks.org/longest-common-subsequence-dp-4/

Other comments

Hope this works fine and if there is any issue please ping me. This is also part of SWoC contribution


.. [1] https://en.wikipedia.org/wiki/Longest_common_subsequence_problem
"""
if not(isinstance(seq1, (str, tuple, list))) or not(isinstance(seq2, (str, tuple, list))):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Break it into two. As did earlier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep are there any changes needed

Comment on lines 759 to 760
"""
if not(isinstance(seq1, (str, tuple, list))):
Copy link

@sidhu1012 sidhu1012 Jan 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A blank line after 759 and it isn't necessary to use brackets with not.
It should be
if not isinstance (...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just find all the issue and report me, as Gagandeep asked me to reduce the commits to reduce the resource used by travis

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change this for now.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just find all the issue and report me, as Gagandeep asked me to reduce the commits to reduce the resource used by travis

Issues can only be reported when they arise. Can't report them before their occurance.

Commits can be squashed into one so no worries.

Copy link

@sidhu1012 sidhu1012 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@sidhu1012
Copy link

Squash commits if necessary.
Ping @czgdp1807

Comment on lines 114 to 116
I, J = ['O', 'V', 'A', 'L'], ['F', 'O', 'R', 'V', 'A', 'E', 'W']
output = longest_common_subsequence(I, J)
assert expected_result == output

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add one test case for tuple too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sure that can be done

Comment on lines 734 to 735
seq1: String or List or Tuple
seq2: String or List or Tuple
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
seq1: String or List or Tuple
seq2: String or List or Tuple
seq1: Any 1D data structure that can be indexed (like list, tuple, string)
seq2: Any 1D data structure that can be indexed (like list, tuple, string)

Comment on lines 740 to 742
output: tuple
(Length of LCS, Common Sequence)
Common Sequence will be of the same data type as seq1.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
output: tuple
(Length of LCS, Common Sequence)
Common Sequence will be of the same data type as seq1.
output: tuple
The first element of the tuple represents the length of longest common subsequence and
the second element is the longest common subsequence itself.
Common subsequence will be of the same data type as that of input sequences.

Comment on lines 760 to 763
if not isinstance(seq1, (str, tuple, list)):
raise TypeError("Only Strings, Tuple and List are allowed")
if not isinstance(seq2, (str, tuple, list)):
raise TypeError("Only Strings, Tuple and List are allowed")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if not isinstance(seq1, (str, tuple, list)):
raise TypeError("Only Strings, Tuple and List are allowed")
if not isinstance(seq2, (str, tuple, list)):
raise TypeError("Only Strings, Tuple and List are allowed")

raise TypeError("Only Strings, Tuple and List are allowed")

row, col = len(seq1), len(seq2)
check_mat = [[0 for _ in range(col+1)] for x in range(row+1)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using a nested dict. AFAICT, half of the matrix in these type of problems is untouched and wasted. dict will be at least theoretically better.

@Arvind-raj06
Copy link
Member Author

Arvind-raj06 commented Jan 17, 2021

 for i in range(row):
    check_mat[i+1]=[0 for _ in range(col+1)]
    for j in range(col):
           if (seq1[i] == seq2[j]):
                check_mat[i+1][j+1] = check_mat[i][j]+1
            else:
                check_mat[i+1][j+1] = max(check_mat[i+1][j], check_mat[i][j+1])

I tried it in nested dict but at the else part in comparison had the index not present in dictionary. So I had to implement them in list within dictionary

Comment on lines 782 to 786
if(type(seq1) == str):
lcseq = ''.join(lcseq)
if(type(seq1) == tuple):
lcseq = tuple(lcseq)
return (lclen, lcseq[::-1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is cryptic. We should keep the input types restricted to OneDimensionalArray only, otherwise such things will create problems while porting the code to statically typed languages like C++.
In addition, applying longest common subseqeunce on strings would be confusing because there is already something called longest common substring.
Hence the final API should be, accept two OneDimensionalArray objects and return a OneDimensionalArray.

P.S. That is why doing some background lookups are preferred for discussing APIs rather than just directly coding out things and keep changing frequently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah next time onwards we discuss and start the implementation and I will try to implement the above in One dimensional

@czgdp1807
Copy link
Member

See, https://en.wikipedia.org/wiki/Longest_common_subsequence_problem#Code_optimization and include the ideas in the page in your code.

@Arvind-raj06
Copy link
Member Author

See, https://en.wikipedia.org/wiki/Longest_common_subsequence_problem#Code_optimization and include the ideas in the page in your code.

Yeah got that

@Arvind-raj06
Copy link
Member Author

If I had to implement them as in the Wikipedia, the left out comparison had to be added manually that would look cryptic. But still the comparison time complexity is reduced rather than before and for space complexity if I reduce them the time gets increased as it has to be in recursive.

Screenshot (197)

You can preview it here in the implementation

@Arvind-raj06
Copy link
Member Author

@czgdp1807 Label for the Swoc and be attached here and also for the #312 for making the counting easier

@czgdp1807 czgdp1807 merged commit 4e0372d into codezonediitj:master Jan 19, 2021
@Arvind-raj06 Arvind-raj06 deleted the Don'tstop branch January 23, 2021 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Longest common subsequence
3 participants