SnowballStemmer() full text not working #2

Open
afiqmuzaffar opened this issue Jun 15, 2021 · 1 comment

Comments

@afiqmuzaffar

Hi @japborst, to be honest I'm new to Dart.
In the code below, I'm trying to stem every word of a full sentence,
but when I pass a full sentence, only the last word gets stemmed.

How can I stem a full sentence? Am I missing something here?

// data.dart
class data {
  var result1;
  var result2;
  var result3;

  data({
    this.result1,
    this.result2,
    this.result3,
  });
}

class data_db {
  static List<data> d_db = [
    data(
      result1: 'Kicking running JumPing',
      result2: 'Running',
      result3: 'Jumping',
    ),
  ];
}


import 'package:simple_application/SnowballStemmer.dart';
import 'package:simple_application/data.dart';

String data1 = "";
String data2 = "";
String data3 = "";

int index = 0;

void main(List<String> arguments) {
  SnowballStemmer stemmer1 = SnowballStemmer();
  data1 = stemmer1.stem(data_db.d_db[index].result1);
  data2 = stemmer1.stem(data_db.d_db[index].result2);
  data3 = stemmer1.stem(data_db.d_db[index].result3);
  print(data1);
  print(data2);
  print(data3);
}

DEBUG CONSOLE

Connecting to VM Service at http://127.0.0.1:44881/Cu9X-H9iBW4=/
kicking running jump
run
jump
Exited
@japborst
Owner

Hey @afiqmuzaffar. That's currently the same behaviour as in NLTK, which this is a Dart port of: stem() expects a single word, not a sentence. See https://github.com/nltk/nltk

Now, you might still want to stem a full sentence. In that case, you can either use a for-loop or a simple split, map and join.

For example:

PorterStemmer stemmer = PorterStemmer();

String sentence = "Kicking running JumPing";
print(sentence
    .split(" ")
    .map((s) => stemmer.stem(s))
    .join(" "));
// prints: kick run jump

That does lose the uppercasing, though. I see the original NLTK implementation added a parameter for that, so I added it here as well and released a new version (v2.2).

Now you can do this as well:

PorterStemmer stemmer = PorterStemmer();

String sentence = "Kicking running JumPing";
print(sentence
    .split(" ")
    .map((s) => stemmer.stem(s, toLowerCase: false))
    .join(" "));
// prints: Kick run JumP
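
The for-loop alternative mentioned above could look roughly like the sketch below. The names `stemSentence` and `toyStem` are hypothetical and not part of the package; `toyStem` is only a stand-in for `stemmer.stem` so the snippet runs on its own. Splitting on a whitespace pattern instead of a single space is an added assumption that keeps repeated spaces from producing empty words.

```dart
// Hypothetical helper (not part of the stemmer package): stems each
// whitespace-separated word with the supplied function and joins the
// results with single spaces.
String stemSentence(String sentence, String Function(String) stemWord) {
  final stemmed = <String>[];
  for (final word in sentence.trim().split(RegExp(r'\s+'))) {
    if (word.isEmpty) continue; // an empty or all-space sentence has no words
    stemmed.add(stemWord(word));
  }
  return stemmed.join(' ');
}

// Toy stand-in for stemmer.stem, NOT a real stemming algorithm:
// it lowercases the word and strips a trailing "ing".
String toyStem(String word) {
  final lower = word.toLowerCase();
  return lower.endsWith('ing') ? lower.substring(0, lower.length - 3) : lower;
}
```

With the real package you would pass the stemmer instead of the stand-in, for example `stemSentence(sentence, (w) => stemmer.stem(w))`.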
