Turkish: Document proper noun suffix removal
ojwb committed Oct 12, 2024
1 parent 25b8515 commit d79c3fc
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions algorithms/turkish/stemmer.tt
@@ -40,6 +40,15 @@ Conference
</p>
</DL>

<p>
In addition to the steps described in the paper, there is an initial step:
if the input contains an apostrophe, we truncate it at the first apostrophe,
which aims to remove proper noun suffixes. For example, <i>Türkiye'dir</i>
("it is Turkey") is truncated to <i>Türkiye</i> ("Turkey"), which is then
stemmed as it would be without the proper noun suffix. (This step was added
in Snowball 2.3.0.)
</p>
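
The truncation step described above can be illustrated with a minimal Python sketch. This is not the Snowball implementation: the function name `truncate_at_apostrophe` is hypothetical, and the sketch assumes only the ASCII apostrophe (U+0027) marks the suffix boundary.

```python
def truncate_at_apostrophe(word: str) -> str:
    """Return the part of `word` before its first apostrophe, if any.

    Illustrative only: assumes the ASCII apostrophe (U+0027) is the
    only character that separates a proper noun from its suffix.
    """
    # str.partition splits at the first occurrence of the separator;
    # when the separator is absent, the whole word is returned as the head.
    head, _sep, _tail = word.partition("'")
    return head


if __name__ == "__main__":
    # The remaining stemming steps would then run on the truncated word.
    print(truncate_at_apostrophe("Türkiye'dir"))
```

A word with no apostrophe passes through unchanged, so under these assumptions the step only affects inputs written with a proper noun suffix.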

<h2>The algorithm in Snowball</h2>

[% highlight_file('turkish') %]