LCS and Edit Distance

Concepts: pyLCS, pyEditDistance, pyRecordLinkage

Longest Common Subsequence and edit distance are the two most important 2D DP problems for data engineering interviews. They are not just algorithm exercises. They are the foundation of diff tools, schema reconciliation, log pattern matching, and record linkage. If you can implement both from scratch and explain the DE applications, you are answering at the senior level.

Longest Common Subsequence (LeetCode 1143)

Given two strings s1 and s2, find the length of their longest common subsequence (not necessarily contiguous).

dp[i][j] = LCS length for s1[:i] and s2[:j].

If s1[i-1] == s2[j-1]: characters match, extend the LCS: dp[i][j] = dp[i-1][j-1] + 1.

If they differ: skip one character from either string and take the better result: dp[i][j] = max(dp[i-1][j], dp[i][j-1]).

Edit Distance (Leet