compromise uses semver, and pushes to npm frequently (github-releases occasionally)
- Major is considered a breaking api change.
- Minor is considered a behaviour/performance change.
- Patch is an obvious, non-controversial bugfix
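
As a rough illustration of the three levels above, the sketch below compares two version strings and reports which part changed. The `classifyBump` helper and the version numbers are hypothetical, made up for this example; it is not part of compromise, and it assumes plain `major.minor.patch` strings with no pre-release suffix.

```javascript
// Hypothetical helper for illustration only; not part of compromise.
// Assumes plain 'major.minor.patch' strings (no pre-release or build tags).
function classifyBump(from, to) {
  const parse = (v) => v.split('.').map((n) => parseInt(n, 10))
  const [aMajor, aMinor, aPatch] = parse(from)
  const [bMajor, bMinor, bPatch] = parse(to)
  if (aMajor !== bMajor) return 'major' // breaking api change
  if (aMinor !== bMinor) return 'minor' // behaviour/performance change
  if (aPatch !== bPatch) return 'patch' // obvious, non-controversial bugfix
  return 'none'
}

console.log(classifyBump('10.7.2', '11.0.0')) // 'major'
console.log(classifyBump('11.8.0', '11.9.0')) // 'minor'
console.log(classifyBump('11.9.0', '11.9.1')) // 'patch'
```

Under this reading, a dependency range that allows minor updates can still pick up behaviour or performance changes, so pinning to a patch range is the more conservative choice.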
- `.quotations()` no longer returns repeated results for nested quotes
- simplify quotation tagset
- `.out('normal')` no longer includes quotes or trailing-possessives
- improve `.debug()` on client-side
- better honorific support, add `honorifics` feature to .normalize()
- ellipses bugfixes
- replace unicode chars in `.normalize()` now by default
- `acronyms().stripPeriods()` and `acronyms().addPeriods()`
- tag professions as `#Actor`
- add more behaviours to `.normalize()`
- support match-results as inputs to .match() and .not()
- support some us-state abbreviations like 'Phoenix AZ'
- add `nouns().toPossessive()`
- ngrams now remove empty-terms in contractions - fixes counting issue #476
- expose internal `sentences().isQuestion()` method
- `.join()` as an alias for `.flatten()`
- slightly different behavior for wildcards in capture-groups pull/472
- `.possessives()` subset + `#Possessive` tagging fixes
- hide massive `world` output for console.log of a term
- improve quotations() method
- add .parentheses() method
- add 'nickname' support to .people()
- 'will be #Adjective' now tagged as Copula
- include adverbs in verb conjugation (more) consistently
- `sentences().toContinuous()` and `verbs().toGerund()`
- some more aliases for jquery-like methods api
- move `getPunctuation`, `setPunctuation` from .sentence to main Text method
- rename internal `endPunctuation` to `getPunctuation`
- more consistent `cardinal/ordinal` tagging for values
- add #Abbreviation tag
- add #ProperNoun tag
- fixes for noun inflection
- include old ending punctuation in a `.replace()` cmd
- almost-double the support for first-names
- changes to bestTag method
- rolls-back some aggressive JustesonKatz stuff
- better support for emdash numberRange
- 'can't' contraction bugfix
- fix for dates().toShortForm()
- add `#Multiple` Values tag, and changes to how invalid numbers like 'sixty fifteen hundred' are understood
- better em-dash/en-dash support
- better conjugate implicit verbs inside contractions - "i'm", "we've"
- nouns().articles() method
- neighborhoods as #Place
- support more complex noun-phrases with JustesonKatz in `.nouns()`
- support for persistent lexicon/tagset changes
- `addTags, addWords, addRegs, addPlurals, addConjugations` methods to extend native data
- `.plugin()` method to wrap all of these into one
- (removal of `.packWords()` method)
- (removal of
- more `.organizations()` matches
- regex-support in .match() - `nlp('it is waaaay cool').match('/aaa/').out() //'waaaay'`
- improved apostrophe-s disambiguation
- support whitespace before sentence boundary
- improved QuestionWord tagging, some `.questions()` without a question-mark
- phrasalVerb conjugation
- new #Activity tag for Gerunds as nouns 'walking is fun'
- change ngram params to an object `{size:int, max:int}`
- implement '[]' capture-group syntax in .match()
- bring-back `map, filter, foreach and reduce` methods
- set `.words()` as alias for `.terms()`
- `people().firstNames()`, `people().lastNames()`
- split-out comma-separated adverbs
- fix for '.watch' reserved word in efrt
- improved `places()` parsing
- improved `{min,max}` match syntax
- new `.out('match')` method
- quiet addition of .pack() and .unpack() for owen
- move internal lexicon around, to support new format in v11
- added states & provinces as #Region
- added #Comparable tag for adjectives that conjugate
- add increment/decrement/add/subtract methods to .values()
- add units(), noUnits() methods to .values()
- 'uncountable' nouns are no longer assumed to be singular
- money tag is no longer always a value
- improved tagging of `VerbPhrase` and `Condition`
- fixes to contractions in sentence-changes - "i'm going" -> "i went"
- several verb conjugation fixes
- accept Terms & Result objects in .match() and .replace()
- new `Percent` tag
- lump more units in with `.values()`
- .trim() method
- adjective tagging fixes
- some new .out() methods
- fix return format of .isPlural(), so it acts like a match filter
- less-greedy date tagging & ambiguous month fixes
- cleanup & rename some `.value()` methods
- change lumping behaviour of lexicon terms with multiple words
- keep more former tags after a term replace method
- new `.random()` method
- new `.lessThan()`, `.greaterThan()`, `.equalTo()` methods
- new prefix/suffix/infix matches with `_ffix` syntax
- `tag()` supports a sequence of tags for a sequence of terms
- .match 'range' queries now use a real match - `#Adverb{2,4}`
- new `.before()` and `.after()` match methods
- removes `.lexicon()` method for many-lexicons concept
- changes params of `.replaceWith()` method to a 'keyTags' boolean
- improved .debug() and logging on client-side
- pretty-real filesize reduction by swapping es6 classes for es5 inheritance
- rename `Term.tag` object to `Term.tags` so the `.tag()` method can work throughout more-consistently
- fix 'Auxillary' tag typo to 'Auxiliary'
- optimisation of .match(), and tagset - significant speedup!
- adds `.tagger()` method and cleanup extra params
- adds `wordStart` and `wordEnd` offsets to `.out('offset')` for whitespace+punctuation
- new `.has()` method for faster lookups
- add `nlp.out('index')` method, 12 bugs
- add `nlp.tokenize()` method for disabling pos-tagging of input
- less-ambitious date-parsing of nl-date forms
- filesize reduction using efrt data structure (254k -> 214k)
- fix for IE9
- weee! big change! npm package rename
- builds now using browserify + derequire()
- re-written term-lumper logic
- new nlp.lexicon({word:'POS'}) flow
- be consistent with `text.normal()`, `term.all_forms()`, `text.word_count()`. `text.normal()` includes sentence-terminators, like periods etc.
- airport codes support, helper methods for specific POS
- newlines split sentences
- Text methods now return this, instead of array of sentences
- more-sensible responses for invalid, non-string inputs
- 14 PRs, with fixes for currencies, pluralization, conjugation
- Value.to_text() new method, fix "Posessive" POS typo
- return of the text.spot() method (Re:#107)
- more aggressive lumping of dates, like 'last week of february'
- whitespace reproduction in .text() methods
- move negate from sentence to verb & statement
- rename 'implicit' to 'expansion' for smarter contractions
- added readable-compression to adj, verbs (121kb -> 117kb)
- hyphenated words are normalized into spaces
- grammar-aware match & replace functions
- Statement & Question classes
- split ngram, locale, and syllables into plugins in separate repo
- es6 classes, babel building
- better test coverage
- ngram uses term tokenization, so that 'Tony Hawk' is one term, and not two
- more organized pos rules
- Pos tagging is done implicitly now once nlp.Text is run
- Entity spotting is split into .people(), .place(), .organisations()
- unicode normalisation is killed
- opaque two-letter tags are gone
- plugin support
- passive tense detection
- lexicon can be augmented third-party
- date parsing results are different
- smarter handling of ambiguous contractions ("he's" -> ["he is", "he has"])
- added name genders and beginning of co-reference resolution ('Tony' -> 'he') API.
- small breaking change on `Noun.is_plural` and `Noun.is_entity`, affording significant pos() speedup. Bumped Major version for these changes.
- Phrasal verbs ('step up'), firstnames and .people()
- Major file-size reduction through refactoring
- New NER choosing algorithm, better capitalisation logic, consolidated tests
- Sentence class methods, client-side demos