New Custom Lexicon from scratch #257

Closed
elderbas opened this issue Aug 12, 2016 · 9 comments

Comments

@elderbas

It seems that once you load new words into the lexicon, they stay there.

What's the best way to use a custom lexicon only temporarily?
Right now, in my Node app, I have to use require-reload to re-instantiate nlp every time.

@spencermountain
Owner

yikes!
yeah, in the new api would you prefer

nlp('mytext', {lexicon:myNewWords}).whatever()

as a somehow non-persistent lexicon change? It's a little tricky, and I'm not an expert on how modules get loaded, but there's a fair amount of processing each time the lexicon gets built, so it would be a drag if it had to be rebuilt on each use. Also, the lexicon is just a hashmap, so if you change it, it's all pass-by-reference downstream.
what would be the best way?
cheers
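
To make the pass-by-reference concern concrete, here is a minimal plain-JavaScript sketch. It does not use nlp-compromise at all: `baseLexicon`, `addWords` and `tagWord` are hypothetical names standing in for a shared, module-level lexicon, and the words in it are made up.

```javascript
// Hypothetical stand-in for a module-level lexicon (not the library's real data).
const baseLexicon = { toronto: 'City', walk: 'Verb' };

// Mutates the object it is given, so every later caller sees the new words.
function addWords(lexicon, words) {
  Object.assign(lexicon, words);
  return lexicon;
}

// Looks a word up in whichever lexicon it is handed.
function tagWord(lexicon, word) {
  return lexicon[word.toLowerCase()] || 'Unknown';
}

// Persistent change: the shared object itself is modified.
addWords(baseLexicon, { compromise: 'Library' });
const persistent = tagWord(baseLexicon, 'compromise');

// Non-persistent alternative: copy first, then extend the copy.
// Copying a large lexicon on every call is the rebuild cost mentioned above.
const temporary = Object.assign({}, baseLexicon, { runkit: 'Website' });
const inCopy = tagWord(temporary, 'runkit');
const inBase = tagWord(baseLexicon, 'runkit');
```

The copy keeps the base lexicon clean, but it pays for a full copy each time, which is the trade-off being weighed here.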

@elderbas
Author

I'm not an expert on how modules get loaded either, but thankfully I got it to work for most of my use cases without 'require-reload'.

I had some unit tests that shared the same nlp instance, and I wanted some of them to hit specific lexicons and others not. I decided to instantiate nlp inside the function instead, and the problem went away. =)
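
The per-test instantiation described above could look roughly like this. It is a sketch only: `makeTagger` is a hypothetical factory, not the nlp-compromise API, and the test functions are plain functions rather than calls into any particular test runner.

```javascript
// Hypothetical factory: each call returns a tagger with its own private lexicon.
function makeTagger(customWords = {}) {
  const lexicon = Object.assign({ walk: 'Verb' }, customWords);
  return {
    tag: (word) => lexicon[word] || 'Unknown',
  };
}

// This test wants a custom word to be known...
function testWithCustomLexicon() {
  const tagger = makeTagger({ flarp: 'Noun' });
  return tagger.tag('flarp');
}

// ...and this one must not see it, so it builds its own instance.
function testWithDefaultLexicon() {
  const tagger = makeTagger();
  return tagger.tag('flarp');
}
```

Because nothing is shared between the two functions, the order in which the tests run no longer matters.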

@spencermountain
Owner

hey, your timing is perfect for making this issue.
Last night, i was like ":zap: oh, why don't we just do this?"
the passed-in lexicon is now just a second object, that we consult first. so in v7 it will not be persistent anymore. ... duh!
thanks.
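
The "second object that we consult first" idea can be sketched in a few lines of plain JavaScript. This is an illustration of the layering, not the library's actual v7 code; `builtIn` and `lookup` are hypothetical names.

```javascript
// Hypothetical built-in lexicon: built once and never modified.
const builtIn = Object.freeze({ walk: 'Verb', toronto: 'City' });

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Consult the caller's words first, then fall back to the built-in ones.
function lookup(word, userLexicon = {}) {
  if (has(userLexicon, word)) {
    return userLexicon[word];
  }
  return has(builtIn, word) ? builtIn[word] : null;
}

const custom = { toronto: 'Team' };
const withCustom = lookup('toronto', custom); // the user entry wins
const withoutCustom = lookup('toronto');      // the built-in entry is untouched
```

Since the user's object is only read and never merged into the built-in one, nothing needs to be rebuilt or cleaned up between calls.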

@playground

Hi @spencermountain and @elderbas, I just stumbled onto nlp-compromise last night. I'm trying to understand the difference between plugins and a custom lexicon, and how I might use them. Can you share some use cases?

Let's say I want to help users navigate around my site by asking questions in natural language, for example 'I want to see the readme section' or 'I want to make a reservation', and I want to be able to map those to 'mysite/readme.html' or 'mysite/reservation.html'.

Thanks.

@spencermountain
Owner

hey @playground yeah, it's not very clear right now. I've got some new documentation written but it's stuck behind a few other things. the plugins just allow for even more config than the lexicon parameter. So, here's how i'd go about doing that redirect-example:
https://runkit.com/spencermountain/5a09a7e1d6534300114154fd
cheers
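
For readers who cannot open the notebook, the redirect idea can be approximated with plain regular expressions. To be clear, this is not the contents of the linked RunKit example and it does not use the library's tagging; `routes` and `resolve` are hypothetical names, and the URLs are the ones from the question.

```javascript
// Hypothetical keyword-to-page table; URLs taken from the question above.
const routes = [
  { pattern: /\b(readme|instructions?)\b/i, url: 'mysite/readme.html' },
  { pattern: /\breservations?\b/i, url: 'mysite/reservation.html' },
];

// Return the URL of the first route whose pattern appears in the sentence.
function resolve(sentence) {
  const hit = routes.find((route) => route.pattern.test(sentence));
  return hit ? hit.url : null;
}

const readmeUrl = resolve('I want to see the readme section');
const reservationUrl = resolve('I want to make a reservation');
```

A lexicon or plugin would replace the hand-written patterns with tags on the parsed terms, which is where the library earns its keep.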

@playground

Hey @spencermountain, thanks for getting back to me. Yeah, I'm really digging nlp-compromise, it's one of the better libraries I have seen so far. I posted an issue at nlp-compromise/compromise-plugin#1, though I think it probably belongs here. I forked the repo and implemented some code to support questions and responses (more like commands and controls) via the plugin. I would love to contribute, but before I do, I would like to spend a little time chatting with you to make sure I'm doing things in the right order.

@spencermountain
Owner

hey, yeah just getting back to all my emails ;)

i think what happened in your example is that the .match() syntax isn't smart enough to handle the nested (instruction(s)) group. 😓

but yeah, you're on the right track with your usage of the plugin, and I'd love some help, anywhere you'd like to pitch in. The plugin scheme was finished in v11, about 2 weeks ago, so you'll be the second or third person/company to use it, i think. The main motivation was to be able to compress config data, the same way the library does internally. So it's pretty rough right now, and certainly lacking good docs, but ready for intrepid use.

@playground

playground commented Nov 13, 2017

I see. Hmm, it would be great if these would work :-)
"to? (readme|instruction(s)?)": "Navigation"
or
"to? (weekend|today|tonight|current|tomorrow) (forecast|weather condition|condition)": "Navigation"

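One possible way around the nested group in the first pattern is to spell out both word forms inside the existing OR group before handing the pattern to the plugin. The helper below is a hypothetical sketch in plain JavaScript; whether the plugin's pattern syntax then accepts the flattened form is an assumption that would need checking against the library.

```javascript
// Rewrites word(suffix)? as word|wordsuffix.
// Only safe when the word already sits inside an OR group, as in the pattern above.
function flattenOptionalSuffix(pattern) {
  return pattern.replace(/(\w+)\((\w+)\)\?/g, '$1|$1$2');
}

const nested = 'to? (readme|instruction(s)?)';
const flat = flattenOptionalSuffix(nested);

// A pattern without a nested group passes through unchanged.
const untouched = flattenOptionalSuffix('to? (weekend|today) (forecast|condition)');
```

Here `flat` becomes `to? (readme|instruction|instructions)`, which has only one level of parentheses.
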
On the commands and controls side, I'm still playing around with it. Here is what I have added to provide responses based on what is described in the plugin. For example:

  "responses": {
    "Navigation": {
      "readme|instruction": "https://mydomain/readme.html",
      "calendar|appointment": "https://mydomain/calendar.html",
      "direction|address": "https://mydomain/map.html"
    },
    "Questions": {
      "what are your hours?|what time do you open?": "Our hours of operation are from 8am - 5pm",
      "do you open on (saturday|sunday|...)": "Provide whatever answers..."
    }
  }

in world/index.js

const addResponses = require('./addResponses');

World.prototype.plugin = function(obj) {
  ...
  if (obj.responses) {
    // addResponses relies on `this` being the World instance
    addResponses.call(this, obj.responses);
  }
};

addResponses.js

// Builds a lookup of canned responses from the plugin's "responses" block.
// Must be called with the World instance as `this`.
const addResponses = function(obj) {
  this.responses = this.responses || {};
  Object.keys(obj).forEach((k1) => {
    // Each response type (e.g. "Navigation") gets a list of { reg, res } pairs.
    this.responses[k1] = [];
    Object.keys(obj[k1]).forEach((k2) => {
      this.responses[k1].push({
        reg: new RegExp(k2, 'i'),
        res: obj[k1][k2]
      });
    });
  });

  // Returns the first response of the given type that matches a tagged term, or ''.
  this.responses.find = (type, list = []) => {
    const tag = this.responses.hasTagOf(type, list);
    const res = this.responses[type] || [];
    let answer = '';
    res.some((r) => {
      if (tag.some((t) => r.reg.test(t))) {
        answer = r.res;
        return true;
      }
      return false;
    });
    return answer;
  };

  // Collects the normalized text of every term tagged with `type`.
  this.responses.hasTagOf = (type, list = []) => {
    const tag = [];
    list.forEach((ls) => {
      ls.terms.forEach((ts) => {
        if (ts.tags[type]) {
          tag.push(ts.normal);
        }
      });
    });
    return tag;
  };
};
module.exports = addResponses;
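
To show how the pieces above fit together, here is a self-contained sketch that can be run without the library. It uses a condensed copy of the `addResponses` idea, a bare object in place of the World, and hand-built terms in the `{ normal, tags }` shape that the snippet above reads; the real shape of a parsed result is assumed from that snippet, and `termsTagged`, `findResponse` and `fakeList` are hypothetical names.

```javascript
// Condensed copy of the addResponses idea above, so the sketch runs on its own.
function addResponses(obj) {
  this.responses = this.responses || {};
  Object.keys(obj).forEach((type) => {
    this.responses[type] = Object.keys(obj[type]).map((pattern) => ({
      reg: new RegExp(pattern, 'i'),
      res: obj[type][pattern],
    }));
  });
}

// Collect the normalized text of every term carrying the given tag.
function termsTagged(type, list) {
  const found = [];
  list.forEach((ls) => ls.terms.forEach((t) => {
    if (t.tags[type]) found.push(t.normal);
  }));
  return found;
}

// Return the first response whose pattern matches a tagged term, else ''.
function findResponse(world, type, list) {
  const words = termsTagged(type, list);
  const candidates = world.responses[type] || [];
  const hit = candidates.find((r) => words.some((w) => r.reg.test(w)));
  return hit ? hit.res : '';
}

// Hypothetical stand-ins: a bare object for the world, hand-built tagged terms.
const world = {};
addResponses.call(world, {
  Navigation: {
    'readme|instruction': 'https://mydomain/readme.html',
    'calendar|appointment': 'https://mydomain/calendar.html',
  },
});
const fakeList = [{ terms: [
  { normal: 'show', tags: { Verb: true } },
  { normal: 'calendar', tags: { Navigation: true } },
] }];
const answer = findResponse(world, 'Navigation', fakeList);
```

Keeping the helpers as standalone functions, rather than attaching them to `responses`, avoids a clash if a response type were ever named `find` or `hasTagOf`.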

@spencermountain
Owner

hey, in v12 there's a more sensible way to configure per-parse lexicon information directly.
Every doc has a world object, which can be set directly.
It may take some fussing-with, but for my purposes i've been able to avoid nlp.clone() and require-reload, by using our new plugin setup
cheers
