Commit

Merge branch 'release/4.0.0'
paulirwin committed Mar 9, 2021
2 parents 17ce434 + f9c152c commit b1b6399
Showing 4 changed files with 46 additions and 7 deletions.
29 changes: 28 additions & 1 deletion README.md
@@ -141,7 +141,34 @@ This algorithm is usually used for optical character recognition (OCR) applicati

It can also be used for keyboard typing auto-correction. Here, for example, the cost of substituting E and R is lower because these keys are adjacent on an AZERTY or QWERTY keyboard, so the probability that the user mistyped one for the other is higher.

<!-- TODO.JB - port Java example code -->
```cs
using System;
using F23.StringSimilarity;

public class Program
{
    public static void Main(string[] args)
    {
        var l = new WeightedLevenshtein(new ExampleCharSub());

        Console.WriteLine(l.Distance("String1", "String1"));
        Console.WriteLine(l.Distance("String1", "Srring1"));
        Console.WriteLine(l.Distance("String1", "Srring2"));
    }
}

// Top-level types cannot be declared private in C#, so the default (internal) accessibility is used
class ExampleCharSub : ICharacterSubstitution
{
    public double Cost(char c1, char c2)
    {
        // Substituting 't' with 'r' costs less because the two keys are adjacent on a keyboard
        if (c1 == 't' && c2 == 'r') return 0.5;

        // In most cases, the cost of substituting 2 characters is 1.0
        return 1.0;
    }
}
```

## Damerau-Levenshtein
Similar to Levenshtein, Damerau-Levenshtein distance with transposition (sometimes also called unrestricted Damerau-Levenshtein distance) is the minimum number of operations needed to transform one string into the other, where an operation is defined as an insertion, deletion, or substitution of a single character, or a **transposition of two adjacent characters**.
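
To make the definition above concrete, here is a minimal, self-contained sketch of the unrestricted Damerau-Levenshtein recurrence. It is illustrative only and is not the library's implementation: the `DamerauSketch` class and its `Distance` method are hypothetical names introduced for this example.

```cs
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of F23.StringSimilarity.
public static class DamerauSketch
{
    public static int Distance(string a, string b)
    {
        int n = a.Length, m = b.Length;
        int inf = n + m; // larger than any reachable distance

        // d is shifted by one in each dimension so that a sentinel
        // row and column (filled with inf) sit at index 0.
        var d = new int[n + 2, m + 2];
        d[0, 0] = inf;
        for (int i = 0; i <= n; i++) { d[i + 1, 0] = inf; d[i + 1, 1] = i; }
        for (int j = 0; j <= m; j++) { d[0, j + 1] = inf; d[1, j + 1] = j; }

        // Last (1-based) position in a at which each character was seen.
        var lastRow = new Dictionary<char, int>();

        for (int i = 1; i <= n; i++)
        {
            int lastMatchCol = 0; // last column in this row where a[i] matched b[j]
            for (int j = 1; j <= m; j++)
            {
                int k;
                lastRow.TryGetValue(b[j - 1], out k); // k stays 0 if the character is unseen
                int l = lastMatchCol;

                int cost = 1;
                if (a[i - 1] == b[j - 1]) { cost = 0; lastMatchCol = j; }

                int substitution = d[i, j] + cost;
                int insertion = d[i + 1, j] + 1;
                int deletion = d[i, j + 1] + 1;
                // Transpose a[k] and a[i], paying for the characters in between.
                int transposition = d[k, l] + (i - k - 1) + 1 + (j - l - 1);

                d[i + 1, j + 1] = Math.Min(
                    Math.Min(substitution, insertion),
                    Math.Min(deletion, transposition));
            }
            lastRow[a[i - 1]] = i;
        }
        return d[n + 1, m + 1];
    }
}
```

With this sketch, `DamerauSketch.Distance("ABCDEF", "ABDCEF")` returns 1 (a single transposition of C and D), and `DamerauSketch.Distance("CA", "ABC")` returns 2, because the unrestricted variant can transpose the two characters and then insert B between them.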
22 changes: 17 additions & 5 deletions src/F23.StringSimilarity/F23.StringSimilarity.csproj
@@ -1,23 +1,35 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netstandard1.0</TargetFramework>
<TargetFrameworks>netstandard2.0;net45</TargetFrameworks>
<PackageId>F23.StringSimilarity</PackageId>
<PackageVersion>3.1.0</PackageVersion>
<PackageVersion>4.0.0</PackageVersion>
<PackageTags>string;similarity;distance;levenshtein;jaro-winkler;lcs;cosine</PackageTags>
<Title>StringSimilarity.NET</Title>
<Authors>James Blair, Paul Irwin</Authors>
<Copyright>Copyright 2018 feature[23]</Copyright>
<Description>A .NET port of java-string-similarity.</Description>
<Summary>A .NET port of java-string-similarity (https://github.com/tdebatty/java-string-similarity). A library implementing different string similarity and distance measures. Several algorithms (including Levenshtein edit distance and siblings, Jaro-Winkler, Longest Common Subsequence, cosine similarity etc.) are currently implemented.</Summary>
<PackageProjectUrl>https://github.com/feature23/StringSimilarity.NET</PackageProjectUrl>
<PackageLicenseUrl>https://raw.githubusercontent.com/feature23/StringSimilarity.NET/master/LICENSE</PackageLicenseUrl>
<PackageIconUrl>https://raw.githubusercontent.com/feature23/StringSimilarity.NET/master/logo.png</PackageIconUrl>
<PackageLicenseExpression>MIT</PackageLicenseExpression>
<PackageIcon>logo.png</PackageIcon>
<PackageRequireLicenseAcceptance>false</PackageRequireLicenseAcceptance>
<PackageTags>string similarity distance cosine damerau jaccard jaro-winkler levenshtein ngram qgram shingle sift4</PackageTags>
<PublishRepositoryUrl>true</PublishRepositoryUrl>
<IncludeSymbols>true</IncludeSymbols>
<SymbolPackageFormat>snupkg</SymbolPackageFormat>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="Microsoft.SourceLink.GitHub" Version="1.0.0" PrivateAssets="All" />
</ItemGroup>

<ItemGroup>
<None Include="logo.png" Pack="true" PackagePath="\" />
</ItemGroup>

<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|AnyCPU'">
<DocumentationFile>bin\Release\netstandard1.0\F23.StringSimilarity.xml</DocumentationFile>
<DocumentationFile>bin\Release\netstandard2.0\F23.StringSimilarity.xml</DocumentationFile>
</PropertyGroup>

</Project>
File renamed without changes
@@ -1,7 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netcoreapp2.0</TargetFramework>
<TargetFramework>netcoreapp3.1</TargetFramework>
</PropertyGroup>

<ItemGroup>
