Here is the algorithm from the Wikipedia page on Levenshtein distances:
int LevenshteinDistance(char s[1..m], char t[1..n])
{
// d is a table with m+1 rows and n+1 columns
declare int d[0..m, 0..n]
for i from 0 to m
d[i, 0] := i // deletion
for j from 0 to n
d[0, j] := j // insertion
for j from 1 to n
{
for i from 1 to m
{
if s[i] = t[j] then
d[i, j] := d[i-1, j-1]
else
d[i, j] := minimum
(
d[i-1, j] + 1, // deletion
d[i, j-1] + 1, // insertion
d[i-1, j-1] + 1 // substitution
)
}
}
return d[m, n]
}
(I m sure you can make java out of that with a little work.)
pass in your two DNA sequences as s
and t
and it will return the distance as an int.