LCS
LCS key1 key2 [LEN] [IDX] [MINMATCHLEN min-match-len] [WITHMATCHLEN]
- Available since:
- Redis Open Source 7.0.0
- Time complexity:
- O(N*M) where N and M are the lengths of s1 and s2, respectively
- ACL categories:
-
@read,@string,@slow, - Compatibility:
- Redis Software and Redis Cloud compatibility
The LCS command implements the longest common subsequence algorithm. Note that this is different than the longest common string algorithm, since matching characters in the string does not need to be contiguous.
For example, the LCS between "foo" and "fao" is "fo", since scanning the two strings from left to right, the longest common set of characters is composed of the first "f" and then the "o".
LCS is very useful in order to evaluate how similar two strings are. Strings can represent many things. For instance if two strings are DNA sequences, the LCS will provide a measure of similarity between the two DNA sequences. If the strings represent some text edited by some user, the LCS could represent how different the new text is compared to the old one, and so forth.
Note that this algorithm runs in O(N*M) time, where N is the length of the first string and M is the length of the second string. So either use a separate Redis instance to run this algorithm, or make sure to run it against very small strings.
Required arguments
key1
The name of the first key.
key2
The name of the second key.
Optional arguments
LEN
Return the length of the longest common subsequence instead of the subsequence itself.
IDX
Return the positions of the matches instead of the matching string.
MINMATCHLEN min-match-len
When used with IDX, only return matches at least min-match-len characters long.
WITHMATCHLEN
When used with IDX, include the length of each match in the result.
Examples
> MSET key1 ohmytext key2 mynewtext
OK
> LCS key1 key2
"mytext"
Sometimes you need just the length of the match:
> LCS key1 key2 LEN
(integer) 6
However what is often very useful is to know the match position in each string:
> LCS key1 key2 IDX
1) "matches"
2) 1) 1) 1) (integer) 4
2) (integer) 7
2) 1) (integer) 5
2) (integer) 8
2) 1) 1) (integer) 2
2) (integer) 3
2) 1) (integer) 0
2) (integer) 1
3) "len"
4) (integer) 6
Matches are produced from the last one to the first one, since this is how the algorithm works, and it more efficient to emit the matches in the same order. The above array means that the first match (second element of the array) is between positions 2-3 of the first string and 0-1 of the second. Then there is another match between 4-7 and 5-8.
To restrict the list of matches to the ones of a given minimal length:
> LCS key1 key2 IDX MINMATCHLEN 4
1) "matches"
2) 1) 1) 1) (integer) 4
2) (integer) 7
2) 1) (integer) 5
2) (integer) 8
3) "len"
4) (integer) 6
Finally, you may want to get the match length as well:
> LCS key1 key2 IDX MINMATCHLEN 4 WITHMATCHLEN
1) "matches"
2) 1) 1) 1) (integer) 4
2) (integer) 7
2) 1) (integer) 5
2) (integer) 8
3) (integer) 4
3) "len"
4) (integer) 6
Redis Software and Redis Cloud compatibility
| Redis Software |
Redis Cloud |
Notes |
|---|---|---|
| ✅ Standard |
✅ Standard |
Return information
One of the following:
- Bulk string reply: the longest common subsequence.
- Integer reply: the length of the longest common subsequence when LEN is given.
- Array reply: an array with the LCS length and all the ranges in both the strings when IDX is given.