Levenshtein Distance

The Levenshtein Distance algorithm is a method used to measure the similarity between two strings by calculating the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other.

This algorithm is particularly useful in fields like computational linguistics, spell checking, and bioinformatics.

pseudocode:

function levenshteinDistance(s1, s2):
    // Create a 2D array to store the distances
    let distances = new Array(s1.length + 1, s2.length + 1)

    // Initialize the first row and column
    for i from 0 to s1.length:
        distances[i][0] = i
    for j from 0 to s2.length:
        distances[0][j] = j

    // Compute the distances for the rest of the matrix
    for i from 1 to s1.length:
        for j from 1 to s2.length:
            // Calculate costs for insertion, deletion, and substitution
            let substitutionCost = 0
            if s1[i - 1] ≠ s2[j - 1]:
                substitutionCost = 1
            
            // Choose the minimum of the three operations
            distances[i][j] = min(
                distances[i - 1][j] + 1,   // deletion
                distances[i][j - 1] + 1,   // insertion
                distances[i - 1][j - 1] + substitutionCost  // substitution
            )

    // Return the bottom-right value of the matrix
    return distances[s1.length][s2.length]
function levenshteinDistance(str1, str2) {
    const len1 = str1.length;
    const len2 = str2.length;

    // Create a 2D array to store the distances
    const dp = Array(len1 + 1).fill(null).map(() => Array(len2 + 1).fill(0));

    // Initialize the first row and column of the array
    for (let i = 0; i <= len1; i++) {
        dp[i][0] = i;
    }
    for (let j = 0; j <= len2; j++) {
        dp[0][j] = j;
    }

    // Fill in the rest of the array
    for (let i = 1; i <= len1; i++) {
        for (let j = 1; j <= len2; j++) {
            const cost = str1[i - 1] === str2[j - 1] ? 0 : 1;
            dp[i][j] = Math.min(
                dp[i - 1][j] + 1, // Deletion
                dp[i][j - 1] + 1, // Insertion
                dp[i - 1][j - 1] + cost // Substitution
            );
        }
    }

    // Return the final result (bottom-right corner of the array)
    return dp[len1][len2];
}

// Example usage
const string1 = "kitten";
const string2 = "sitting";
const distance = levenshteinDistance(string1, string2);
console.log(`The Levenshtein distance between "${string1}" and "${string2}" is ${distance}.`);

In this example:

  • We define a function levenshteinDistance that takes two strings, str1 and str2, as input.

  • We create a 2D array dp to store the distances between substrings of str1 and str2.

  • We initialize the first row and column of the array to represent the distances between an empty string and substrings of str1 and str2.

  • We then iterate through each character of str1 and str2, calculating the minimum cost of converting substrings of str1 to substrings of str2.

  • Finally, we return the distance between the entire strings, which is stored in the bottom-right corner of the array.

Last updated