#german-stemmer

A German stemmer implementation in PHP that reduces a word to its German stem using the Porter stemmer algorithm.

The original code was written by Aris Buzachis (original repo). Modifications I made include:

  • Switch to mb_* string functions
  • Namespaces, PSR-4 autoloading and composer setup
  • PHPUnit test setup that includes the original test set from the Porter stemmer website
  • Travis setup
  • Semantic versioning
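To illustrate why the switch to mb_* functions matters for German input, here is a minimal standalone sketch. It does not use the library itself; it only compares PHP's byte-oriented and multibyte-aware string functions on a word containing an umlaut. The word is written with explicit byte escapes so the result does not depend on how the file is saved.

```php
<?php
// "vergnüglich" as UTF-8 bytes; "ü" is the two-byte sequence C3 BC.
$word = "vergn\xC3\xBCglich";

// Byte-oriented: counts the two bytes of "ü" separately.
$byteLength = strlen($word);

// Multibyte-aware: counts characters, given the encoding.
$charLength = mb_strlen($word, 'UTF-8');

// substr() cuts in the middle of "ü" and leaves an invalid byte sequence;
// mb_substr() cuts on a character boundary.
$byteCut = substr($word, 0, 6);
$charCut = mb_substr($word, 0, 6, 'UTF-8');

echo "bytes: $byteLength, characters: $charLength\n";
echo "mb_substr: $charCut\n";
echo "substr result is valid UTF-8: "
    . (mb_check_encoding($byteCut, 'UTF-8') ? 'yes' : 'no') . "\n";
```

A stemmer that strips suffixes by position would miscount on such words if it relied on the byte-oriented functions.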

##Basic Usage

$word = "vergnüglich";
echo "$word => ".GermanStemmer::stem($word);

Output

vergnüglich => vergnug

###Caution

Make sure to use UTF-8 as the character encoding; otherwise GermanStemmer::stem() might throw an InvalidArgumentException.

//set encoding to utf-8
mb_internal_encoding("utf-8");
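If the input may arrive in another encoding, it can be normalised before stemming. The following is a sketch only: `ensureUtf8()` is a hypothetical helper, not part of german-stemmer, and the ISO-8859-1 fallback is an assumption that has to match your actual data source.

```php
<?php
// Hypothetical helper (not part of german-stemmer): returns $word as UTF-8.
// Assumes that anything that is not valid UTF-8 is in $fallbackEncoding.
function ensureUtf8($word, $fallbackEncoding = 'ISO-8859-1')
{
    // Already valid UTF-8: return unchanged.
    if (mb_check_encoding($word, 'UTF-8')) {
        return $word;
    }
    // Otherwise convert from the assumed fallback encoding.
    return mb_convert_encoding($word, 'UTF-8', $fallbackEncoding);
}

mb_internal_encoding('UTF-8');

// "vergnüglich" encoded as ISO-8859-1 ("ü" is the single byte FC).
$latin1 = "vergn\xFCglich";
$utf8   = ensureUtf8($latin1);

// $utf8 is now safe to pass on, e.g. GermanStemmer::stem($utf8).
echo mb_strlen($utf8, 'UTF-8'), " characters\n";
```

Strings that are already valid UTF-8 pass through untouched, so the helper can be applied to every input without checking its origin first.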

###Examples

See examples folder.

##Requirements

  • PHP >= 5.5

##Installation

The recommended way to install german-stemmer is through Composer.

curl -sS https://getcomposer.org/installer | php

Next, update your project's composer.json file to include GermanStemmer:

{
    "repositories": [ { "type": "composer", "url": "http://packages.myseosolution.de/"} ],
    "minimum-stability": "dev",
    "require": {
        "paslandau/german-stemmer": "dev-master"
    },
    "config": {
        "secure-http": false
    }
}

Caution: You need to explicitly set "secure-http": false in order to access http://packages.myseosolution.de/ as a repository. This is required because Composer changed the default value of secure-http to true at the end of February 2016.

After installing, you need to require Composer's autoloader:

require 'vendor/autoload.php';

##Frequently searched questions

  • How can I get the stem of a German word?
  • Where can I find an open source German PHP stemmer?