
usec-official/fathom

 
 


Fathom

Fathom is a supervised-learning system for recognizing parts of web pages—pop-ups, address forms, slideshows—or for classifying a page as a whole. A DOM flows in one side, and DOM nodes flow out the other, tagged with types and probabilities that those types are correct. A Prolog-like language makes it straightforward to specify the “smells” that suggest each type, and a neural-net-based trainer determines the optimal contribution of each smell. Finally, the FathomFox web extension lets you collect and label a corpus of web pages for training.
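The idea of weighted smells can be illustrated with a small, self-contained sketch. It is not Fathom's API: the node dictionaries, smell functions, weights, and bias below are all hypothetical, and combining the smells through a logistic function is an assumption made for illustration. In Fathom itself, the trainer would supply the weights rather than a person typing them in.

```python
import math

# Hypothetical candidate nodes, represented as plain dicts instead of real DOM elements.
nodes = [
    {"tag": "div", "class_name": "modal popup", "covers_viewport": True},
    {"tag": "div", "class_name": "article-body", "covers_viewport": False},
]


def smell_class_mentions_popup(node):
    """Return 1.0 if the class name hints at a pop-up, else 0.0."""
    name = node["class_name"]
    return 1.0 if "popup" in name or "modal" in name else 0.0


def smell_covers_viewport(node):
    """Return 1.0 if the node is flagged as covering the viewport, else 0.0."""
    return 1.0 if node["covers_viewport"] else 0.0


SMELLS = [smell_class_mentions_popup, smell_covers_viewport]

# Hypothetical values; a trainer would normally learn these from labeled pages.
WEIGHTS = [2.0, 1.5]
BIAS = -2.0


def popup_probability(node):
    """Combine the smells as a weighted sum and squash it into the range (0, 1)."""
    z = BIAS + sum(w * smell(node) for w, smell in zip(WEIGHTS, SMELLS))
    return 1.0 / (1.0 + math.exp(-z))


# Each node comes out tagged with a type and the probability that the type is correct.
tagged = [(node["class_name"], "popup", popup_probability(node)) for node in nodes]

for class_name, node_type, probability in tagged:
    print(f"{class_name!r}: {node_type} with probability {probability:.2f}")
```

With these made-up weights, the node that exhibits both smells scores well above 0.5 and the node that exhibits neither scores well below it. A trained model is meant to produce that same separation from labeled examples.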

Continue reading at https://mozilla.github.io/fathom/intro.html#why.

Documentation

About

A framework for extracting meaning from web pages


Languages

  • JavaScript 70.8%
  • Python 26.8%
  • Makefile 1.1%
  • HTML 1.1%
  • Shell 0.2%