Edit distance, and Levenshtein distance in particular, is a nonparametric distance measure defined on strings rather than on numeric vectors, which in some respects sets it apart from well-known distance measures such as the Euclidean or Mahalanobis distances.
Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other.
In the example below, the user is asked for a string in console mode. The input is then compared to the colour names defined in R using adist(), and the closest colour names are reported:
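To make the definition concrete, here is a minimal sketch of the usual dynamic-programming way of computing it. The function name lev_dist is a hypothetical helper introduced only for illustration; in practice base R's adist() already does this job, and the last lines simply compare the two.

```r
# Illustrative sketch only: lev_dist is a hypothetical helper, not part of base R.
lev_dist <- function(a, b) {
  x <- strsplit(a, "")[[1]]   # characters of the first string
  y <- strsplit(b, "")[[1]]   # characters of the second string
  n <- length(x)
  m <- length(y)
  # d[i + 1, j + 1] is the distance between the first i characters of a
  # and the first j characters of b
  d <- matrix(0, nrow = n + 1, ncol = m + 1)
  d[, 1] <- 0:n               # delete all i characters
  d[1, ] <- 0:m               # insert all j characters
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- if (x[i] == y[j]) 0 else 1
      d[i + 1, j + 1] <- min(d[i, j + 1] + 1,   # deletion
                             d[i + 1, j] + 1,   # insertion
                             d[i, j] + cost)    # substitution or match
    }
  }
  d[n + 1, m + 1]
}

lev_dist("kitten", "sitting")        # 3
adist("kitten", "sitting")[1, 1]     # 3, the built-in function agrees
```

The three edits here are k to s, e to i, and an inserted g at the end.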
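# Ask for a word (readline() only waits for input in an interactive session)
user.string <- readline("Enter a word: ")
# Candidate words: the built-in colour names
wordlist <- colours()
# Levenshtein distance from the input to every colour name
dists <- adist(user.string, wordlist)
mindist <- min(dists)
# Indices of all colour names that share the smallest distance
best.ones <- which(dists == mindist)
for (index in best.ones) {
  cat("Did you mean: ", wordlist[index], "\n")
}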
Here are the results:
Enter a word: turtoise
Did you mean: turquoise
Enter a word: turtle
Did you mean: purple
Enter a word: night blue
Did you mean: lightblue
Enter a word: parliament
Did you mean: darkmagenta
Enter a word: marooon
Did you mean: maroon
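The closest colour name is always reported, even when it is not really close, as with "parliament" above. One possible refinement, sketched here under assumptions, is to wrap the lookup in a function and refuse to suggest anything beyond a distance threshold. The name suggest_colour and the default max.dist = 2 are hypothetical choices made for this illustration, not part of the original script.

```r
# Illustrative sketch: suggest_colour and max.dist = 2 are assumptions.
suggest_colour <- function(word, wordlist = colours(), max.dist = 2) {
  # ignore.case = TRUE so that capitalised input is not penalised
  dists <- adist(word, wordlist, ignore.case = TRUE)
  mindist <- min(dists)
  if (mindist > max.dist) {
    return(character(0))      # nothing is close enough to suggest
  }
  wordlist[which(dists == mindist)]
}

suggest_colour("marooon")     # "maroon"
suggest_colour("parliament")  # character(0), no suggestion
```

A fixed threshold is a crude choice; scaling it with the length of the input word would be another option.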
Have a nice read