Adding Longest Common Subsequence #315

Merged
merged 77 commits into from
Jan 19, 2021
Changes from 71 commits
5b34a79
Update queue.py
Arvind-raj06 Jan 7, 2021
52f01c3
Update linked_lists.py
Arvind-raj06 Jan 8, 2021
6610a7f
Update linked_lists.py
Arvind-raj06 Jan 8, 2021
4c5c855
Update linked_lists.py
Arvind-raj06 Jan 8, 2021
f93008a
Update queue.py
Arvind-raj06 Jan 8, 2021
6867add
Update linked_lists.py
Arvind-raj06 Jan 8, 2021
628045c
Update linked_lists.py
Arvind-raj06 Jan 8, 2021
09e5e42
Completed updating the insert_after
Arvind-raj06 Jan 8, 2021
74009e6
Update linked_lists.py
Arvind-raj06 Jan 8, 2021
0055baa
Update queue.py
Arvind-raj06 Jan 8, 2021
9dec38c
Update linked_lists.py
Arvind-raj06 Jan 10, 2021
708f6bc
Cocktail
Arvind-raj06 Jan 10, 2021
27a5f1a
Update algorithms.py
Arvind-raj06 Jan 10, 2021
6ae5a0c
Implementing the cocktail sort
Arvind-raj06 Jan 10, 2021
aaf529a
Update __init__.py
Arvind-raj06 Jan 10, 2021
9f937d4
Correcting error
Arvind-raj06 Jan 10, 2021
38ebcbe
Completion
Arvind-raj06 Jan 10, 2021
9a77768
Converting to ODA
Arvind-raj06 Jan 10, 2021
40350b4
Update algorithms.py
Arvind-raj06 Jan 10, 2021
a049c11
Update algorithms.py
Arvind-raj06 Jan 10, 2021
a990d46
Really!
Arvind-raj06 Jan 10, 2021
e49d268
Including cocktail sort
Arvind-raj06 Jan 11, 2021
c6c5fdd
Correcting for doda
Arvind-raj06 Jan 11, 2021
db70e68
Error Correction
Arvind-raj06 Jan 11, 2021
9acbebf
Yep done!
Arvind-raj06 Jan 11, 2021
57d7fbf
Hope this works fine
Arvind-raj06 Jan 11, 2021
3569652
Update algorithms.py
Arvind-raj06 Jan 11, 2021
c9d4f9c
Commit
Arvind-raj06 Jan 11, 2021
e1d817f
Cocktail update
Arvind-raj06 Jan 11, 2021
864e95c
Update algorithms.py
Arvind-raj06 Jan 11, 2021
570a287
Update test_algorithms.py
Arvind-raj06 Jan 12, 2021
cf91f8a
Let's check
Arvind-raj06 Jan 12, 2021
3c34257
Update test_algorithms.py
Arvind-raj06 Jan 12, 2021
1ae0e3d
Update test_algorithms.py
Arvind-raj06 Jan 12, 2021
89b903a
Fixed cocktail sort
czgdp1807 Jan 13, 2021
7a4581c
cocktail_sort -> cocktail_shaker_sort
czgdp1807 Jan 13, 2021
0c741ca
Starting with Quicksort
Arvind-raj06 Jan 13, 2021
4835dd4
Adding quick sort
Arvind-raj06 Jan 13, 2021
069613a
Making changes
Arvind-raj06 Jan 13, 2021
3f97409
Update algorithms.py
Arvind-raj06 Jan 13, 2021
2177e04
Update __init__.py
Arvind-raj06 Jan 13, 2021
7e23b41
Hope this works
Arvind-raj06 Jan 13, 2021
abe4362
Update algorithms.py
Arvind-raj06 Jan 13, 2021
0ad0b56
Update algorithms.py
Arvind-raj06 Jan 13, 2021
4e1e8d4
Ok
Arvind-raj06 Jan 13, 2021
f8afd26
Update algorithms.py
Arvind-raj06 Jan 13, 2021
377bfd6
added quick sort
Arvind-raj06 Jan 13, 2021
a1fd65a
Merge branch 'master' into Let'scode
Arvind-raj06 Jan 13, 2021
7c4d918
Update algorithms.py
Arvind-raj06 Jan 13, 2021
c042722
Removing whitespace
Arvind-raj06 Jan 13, 2021
ed7a059
Error correction
Arvind-raj06 Jan 13, 2021
9cda682
Start to implement stack
Arvind-raj06 Jan 14, 2021
65d0f88
Hope this work
Arvind-raj06 Jan 14, 2021
f22a7ac
Update algorithms.py
Arvind-raj06 Jan 14, 2021
5da091e
Yep
Arvind-raj06 Jan 14, 2021
2c7afac
Forgot
Arvind-raj06 Jan 14, 2021
e8d13b0
Shifting None to end
czgdp1807 Jan 15, 2021
f7c6463
Restored test
czgdp1807 Jan 15, 2021
b476021
Apply suggestions from code review
czgdp1807 Jan 15, 2021
923b1be
Yes Done
Arvind-raj06 Jan 15, 2021
bcb678e
Pivot picking logic corrected
czgdp1807 Jan 16, 2021
a6fec40
Completed
Arvind-raj06 Jan 16, 2021
bc6cd31
Merge branch 'master' into Don'tstop
Arvind-raj06 Jan 16, 2021
96067eb
Error Correction
Arvind-raj06 Jan 16, 2021
6df16fb
Merge branch 'Don'tstop' of https://github.com/Arvind-raj06/pydatastr…
Arvind-raj06 Jan 16, 2021
a73bc25
Yep
Arvind-raj06 Jan 16, 2021
07fc3e9
Update algorithms.py
Arvind-raj06 Jan 16, 2021
86f389c
Update algorithms.py
Arvind-raj06 Jan 16, 2021
60cfdc2
Merge branch 'Don'tstop' of https://github.com/Arvind-raj06/pydatastr…
Arvind-raj06 Jan 16, 2021
8c26960
Implement
Arvind-raj06 Jan 16, 2021
699b3b2
Update algorithms.py
Arvind-raj06 Jan 16, 2021
0ae9085
Update algorithms.py
Arvind-raj06 Jan 16, 2021
90f1089
Yes added
Arvind-raj06 Jan 17, 2021
2196ec2
Update algorithms.py
Arvind-raj06 Jan 17, 2021
2766287
Implemented
Arvind-raj06 Jan 18, 2021
a85e08b
Added tests and fixed docs
czgdp1807 Jan 19, 2021
478bf66
fixed docs
czgdp1807 Jan 19, 2021
3 changes: 2 additions & 1 deletion pydatastructs/linear_data_structures/__init__.py
@@ -30,6 +30,7 @@
     counting_sort,
     bucket_sort,
     cocktail_shaker_sort,
-    quick_sort
+    quick_sort,
+    longest_common_subsequence
 )
 __all__.extend(algorithms.__all__)
67 changes: 66 additions & 1 deletion pydatastructs/linear_data_structures/algorithms.py
@@ -13,7 +13,8 @@
     'counting_sort',
     'bucket_sort',
     'cocktail_shaker_sort',
-    'quick_sort'
+    'quick_sort',
+    'longest_common_subsequence'
 ]

 def _merge(array, sl, el, sr, er, end, comp):
@@ -722,3 +723,67 @@ def partition(low, high, pick_pivot_element):
         array._modify(force=True)

     return array

+def longest_common_subsequence(seq1, seq2) -> tuple:
+    """
+    Implements Longest Common Subsequence
+
+    Parameters
+    ========
+
+    seq1: String or List or Tuple
+    seq2: String or List or Tuple
Member

Suggested change
-    seq1: String or List or Tuple
-    seq2: String or List or Tuple
+    seq1: Any 1D data structure that can be indexed (like list, tuple, string)
+    seq2: Any 1D data structure that can be indexed (like list, tuple, string)


+    Returns
+    =======
+
+    output: tuple
+        (Length of LCS, Common Sequence)
+        Common Sequence will be of the same data type as seq1.
Member

Suggested change
-    output: tuple
-        (Length of LCS, Common Sequence)
-        Common Sequence will be of the same data type as seq1.
+    output: tuple
+        The first element of the tuple represents the length of longest common subsequence and
+        the second element is the longest common subsequence itself.
+        Common subsequence will be of the same data type as that of input sequences.


+    Examples
+    ========
+
+    >>> from pydatastructs import longest_common_subsequence as LCS
+    >>> LCS("ABCDEF", "ABBCDDDE")
+    (5, 'ABCDE')
+    >>> arr1 = ['A', 'P', 'P']
+    >>> arr2 = ['A', 'p', 'P', 'S', 'P']
+    >>> LCS(arr1, arr2)
+    (3, ['A', 'P', 'P'])
+
+    References
+    ==========
+
+    .. [1] https://en.wikipedia.org/wiki/Longest_common_subsequence_problem
+    """
+    if not(isinstance(seq1, (str, tuple, list))):
sidhu1012 Jan 16, 2021

Add a blank line after line 759. Also, the brackets around the operand of not are unnecessary.
It should be
if not isinstance(...)

Member Author

Can you find all the issues and report them to me together? Gagandeep asked me to reduce the number of commits, to cut down the resources used by Travis.

Member Author

I will change this for now.

> Can you find all the issues and report them to me together? Gagandeep asked me to reduce the number of commits, to cut down the resources used by Travis.

Issues can only be reported when they arise; they can't be reported before they occur.

Commits can be squashed into one, so no worries.


+        raise TypeError("Only Strings, Tuple and List are allowed")
+    if not(isinstance(seq2, (str, tuple, list))):
+        raise TypeError("Only Strings, Tuple and List are allowed")
+
+    row, col = len(seq1), len(seq2)
+    check_mat = [[0 for _ in range(col+1)] for x in range(row+1)]
Member

What about using a nested dict? AFAICT, half of the matrix in these types of problems is untouched and wasted. A dict will be better, at least theoretically.
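
One possible reading of the suggestion above is top-down memoization, where only the cells that are actually visited get stored. The sketch below illustrates that idea and is not code from this PR: `lcs_length_memo` is a hypothetical name, a flat dict keyed by `(i, j)` stands in for the nested dict, and only the length is computed. How much is saved depends on the inputs; in the worst case every cell is still visited, and the recursion depth grows with `len(seq1) + len(seq2)`.

```python
def lcs_length_memo(seq1, seq2):
    """Return (LCS length, number of cells stored) using top-down memoization.

    Hypothetical helper for illustration; not part of pydatastructs.
    """
    memo = {}  # maps (i, j) -> LCS length of seq1[i:] and seq2[j:]

    def solve(i, j):
        # An empty suffix has no common subsequence with anything.
        if i == len(seq1) or j == len(seq2):
            return 0
        if (i, j) not in memo:
            if seq1[i] == seq2[j]:
                # Matching elements extend the subsequence by one.
                memo[(i, j)] = 1 + solve(i + 1, j + 1)
            else:
                # Otherwise drop one element from either sequence.
                memo[(i, j)] = max(solve(i + 1, j), solve(i, j + 1))
        return memo[(i, j)]

    return solve(0, 0), len(memo)


length, cells_stored = lcs_length_memo("ABC", "ABC")
```

For two identical three-element inputs only the three diagonal cells are stored, whereas the list-of-lists table in the diff would allocate 4 x 4 entries.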

+    for i in range(row):
+        for j in range(col):
+            if (seq1[i] == seq2[j]):
+                check_mat[i+1][j+1] = check_mat[i][j]+1
+            else:
+                check_mat[i+1][j+1] = max(check_mat[i+1][j], check_mat[i][j+1])
+
+    lcseq, lclen = [], check_mat[row][col]
+    while(row > 0 and col > 0):
+        if(check_mat[row][col] == check_mat[row][col-1]):
+            col -= 1
+        elif(check_mat[row][col] == check_mat[row-1][col]):
+            row -= 1
+        else:
+            lcseq.append(seq1[row-1])
+            row -= 1
+            col -= 1
+
+    if(type(seq1) == str):
+        lcseq = ''.join(lcseq)
+    if(type(seq1) == tuple):
+        lcseq = tuple(lcseq)
+    return (lclen, lcseq[::-1])
Member

This part is cryptic. We should restrict the input types to OneDimensionalArray only; otherwise such type-dependent handling will create problems when porting the code to statically typed languages like C++.
In addition, applying longest common subsequence to strings would be confusing, because there is already something called longest common substring.
Hence the final API should accept two OneDimensionalArray objects and return a OneDimensionalArray.

P.S. That is why doing some background research and discussing the API first is preferred over coding things out directly and then changing them frequently.

Member Author

Yeah, from next time onwards we will discuss first and then start the implementation. I will try to implement the above with OneDimensionalArray.
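
The distinction raised above between a common subsequence and a common substring can be seen on the docstring's own inputs. The following is a small sketch, not code from this PR: `lcs_length` and `longest_common_substring` are hypothetical helper names, and the substring search uses the standard library's `difflib.SequenceMatcher` purely for illustration.

```python
from difflib import SequenceMatcher


def lcs_length(a, b):
    """Longest common subsequence length (elements need not be adjacent)."""
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    for x in a:
        curr = [0]
        for j, y in enumerate(b):
            # prev[j] is the diagonal cell, prev[j + 1] the cell above,
            # curr[j] the cell to the left.
            curr.append(prev[j] + 1 if x == y else max(prev[j + 1], curr[j]))
        prev = curr
    return prev[-1]


def longest_common_substring(a, b):
    """Longest common substring (elements must be contiguous in both)."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


a, b = "ABCDEF", "ABBCDDDE"
subsequence_length = lcs_length(a, b)        # elements may be spread out
substring = longest_common_substring(a, b)   # elements must be contiguous
```

With these inputs the longest common subsequence has length 5, while the longest common substring is `BCD`, of length 3.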

16 changes: 15 additions & 1 deletion pydatastructs/linear_data_structures/tests/test_algorithms.py
@@ -1,7 +1,8 @@
 from pydatastructs import (
     merge_sort_parallel, DynamicOneDimensionalArray,
     OneDimensionalArray, brick_sort, brick_sort_parallel,
-    heapsort, matrix_multiply_parallel, counting_sort, bucket_sort, cocktail_shaker_sort, quick_sort)
+    heapsort, matrix_multiply_parallel, counting_sort, bucket_sort, cocktail_shaker_sort, quick_sort, longest_common_subsequence)
+

 from pydatastructs.utils.raises_util import raises
 import random
@@ -100,3 +101,16 @@ def test_matrix_multiply_parallel():
     J = [[2, 1, 2], [1, 2, 1], [2, 2, 2]]
     output = matrix_multiply_parallel(I, J, num_threads=1)
     assert expected_result == output

+def test_longest_common_sequence():
+    expected_result = (5, 'ASCII')
+
+    str1, str2 = 'AASCCII', 'ASSCIIII'
+    output = longest_common_subsequence(str1, str2)
+    assert expected_result == output
+
+    expected_result = (3, ['O', 'V', 'A'])
+
+    I, J = ['O', 'V', 'A', 'L'], ['F', 'O', 'R', 'V', 'A', 'E', 'W']
+    output = longest_common_subsequence(I, J)
+    assert expected_result == output

Add one test case for tuple too.

Member Author

Yeah, sure, that can be done.
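
The tuple test requested above might look like the sketch below. To keep the block runnable on its own, the function body is transcribed from the diff in this PR with indentation restored and the reviewer's `not isinstance` style applied; the test name `test_longest_common_subsequence_tuple` and the choice of inputs (the list case above, converted to tuples) are assumptions, not the test that was eventually committed.

```python
def longest_common_subsequence(seq1, seq2):
    # Transcribed from this PR's diff so the example is self-contained.
    if not isinstance(seq1, (str, tuple, list)):
        raise TypeError("Only Strings, Tuple and List are allowed")
    if not isinstance(seq2, (str, tuple, list)):
        raise TypeError("Only Strings, Tuple and List are allowed")

    row, col = len(seq1), len(seq2)
    # check_mat[i][j] holds the LCS length of seq1[:i] and seq2[:j].
    check_mat = [[0 for _ in range(col + 1)] for _ in range(row + 1)]

    for i in range(row):
        for j in range(col):
            if seq1[i] == seq2[j]:
                check_mat[i + 1][j + 1] = check_mat[i][j] + 1
            else:
                check_mat[i + 1][j + 1] = max(check_mat[i + 1][j],
                                              check_mat[i][j + 1])

    # Walk back from the bottom-right cell to recover one LCS in reverse.
    lcseq, lclen = [], check_mat[row][col]
    while row > 0 and col > 0:
        if check_mat[row][col] == check_mat[row][col - 1]:
            col -= 1
        elif check_mat[row][col] == check_mat[row - 1][col]:
            row -= 1
        else:
            lcseq.append(seq1[row - 1])
            row -= 1
            col -= 1

    # Convert the result to the type of the first input.
    if type(seq1) == str:
        lcseq = ''.join(lcseq)
    if type(seq1) == tuple:
        lcseq = tuple(lcseq)
    return (lclen, lcseq[::-1])


def test_longest_common_subsequence_tuple():
    # Hypothetical test: same data as the list case, passed as tuples.
    expected_result = (3, ('O', 'V', 'A'))

    I, J = ('O', 'V', 'A', 'L'), ('F', 'O', 'R', 'V', 'A', 'E', 'W')
    output = longest_common_subsequence(I, J)
    assert expected_result == output


test_longest_common_subsequence_tuple()
```

The second element of the result is a tuple here because the function converts the recovered sequence to the type of `seq1` before reversing it.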