inTIME: A Machine Learning-Based Framework for Gathering and Leveraging Web Data to Cyber-Threat Intelligence

DOI: 10.3390/electronics10070818

Key Functionality

Crawling from different web sources:
- Focused crawl: for discovering new sources of information
  - Uses a machine-learning model
- In-depth crawl: for following the links within a specific domain (e.g., forums)
  - Uses link filters to limit the non-useful pages on each domain
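
To illustrate the link-filter idea behind the in-depth crawl, here is a minimal sketch in Python. It is not the repository's crawler configuration: the allow/deny patterns, the forum domain, and the function names `keep_link` and `filter_links` are all hypothetical and chosen only for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns for an imaginary forum; not taken from the repository.
ALLOW = [re.compile(r"/(thread|topic|forum)s?/")]
DENY = [
    re.compile(r"/(login|register|search|member)"),
    re.compile(r"\.(png|jpe?g|gif|css|js)$"),
]


def keep_link(url: str, domain: str) -> bool:
    """Return True if `url` should be followed while crawling `domain`."""
    parsed = urlparse(url)
    # Stay inside the domain being crawled in depth.
    if parsed.hostname != domain:
        return False
    path = parsed.path
    # Deny rules take precedence over allow rules.
    if any(p.search(path) for p in DENY):
        return False
    return any(p.search(path) for p in ALLOW)


def filter_links(urls, domain):
    """Keep only the links that pass the filters, preserving order."""
    return [u for u in urls if keep_link(u, domain)]
```

Giving deny rules precedence keeps login, search, and static-asset pages out of the crawl frontier even when their paths also happen to match an allow pattern.
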
Content Ranking & Classification of Harvested Data:
- Calculation of relevance scores for the harvested web content with the help of machine-learning language models
- Classification of the harvested web content based on the relevance scores
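
The score-then-classify flow described above can be sketched as follows. This is an assumption-laden toy: a weighted keyword vocabulary with cosine similarity stands in for the machine-learning language models, and the vocabulary `TOPIC_TERMS`, the threshold of 0.3, and the function names are hypothetical rather than taken from the repository.

```python
import math
import re
from collections import Counter

# Hypothetical topic vocabulary; a simple stand-in for a trained language model.
TOPIC_TERMS = Counter(
    {"exploit": 3, "vulnerability": 3, "malware": 2, "botnet": 2, "firmware": 1}
)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance_score(text):
    """Cosine similarity between the document's term counts and the topic vocabulary."""
    doc = Counter(tokenize(text))
    dot = sum(doc[term] * weight for term, weight in TOPIC_TERMS.items())
    norm_doc = math.sqrt(sum(c * c for c in doc.values()))
    norm_topic = math.sqrt(sum(w * w for w in TOPIC_TERMS.values()))
    if norm_doc == 0:
        return 0.0  # empty documents are never relevant
    return dot / (norm_doc * norm_topic)


def classify(text, threshold=0.3):
    """Return a (label, score) pair; the threshold is an illustrative assumption."""
    score = relevance_score(text)
    label = "relevant" if score >= threshold else "irrelevant"
    return label, score
```

Keeping scoring and thresholding as separate functions means the threshold can be tuned, or the scoring model replaced, without touching the other step.
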
Named Entity Recognition for finding actionable CTI in the harvested data:
- Uses rules and trained data to extract named entities from the relevant web content
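
As a rough illustration of the rule-based half of this step, the sketch below extracts a few entity types with regular expressions and a small gazetteer. The trained component is out of scope here, and the rule set `RULES`, the product list `PRODUCTS`, and the function `extract_entities` are hypothetical examples rather than the repository's actual rules.

```python
import re

# Hypothetical pattern rules for two entity types commonly found in CTI text.
RULES = {
    "CVE": re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE),
    "IPV4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
}

# Hypothetical gazetteer of product names, matched case-insensitively.
PRODUCTS = ["OpenSSL", "Apache Struts"]


def extract_entities(text):
    """Return (label, matched_text, start_offset) tuples sorted by position."""
    entities = []
    for label, pattern in RULES.items():
        for match in pattern.finditer(text):
            entities.append((label, match.group(0), match.start()))
    for name in PRODUCTS:
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            entities.append(("PRODUCT", match.group(0), match.start()))
    return sorted(entities, key=lambda entity: entity[2])
```

For example, calling `extract_entities` on a sentence that mentions a CVE identifier, a listed product, and an IPv4 address yields one tuple per mention, in the order they appear in the text.
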

Wikis

Who do I talk to?

This repository is maintained by Paris Koloveas from UoP.

Email: [email protected]

Citing this work

If you use any of the processes or scripts in this repository, please cite the following article:

@Article{KCAST2021,
   AUTHOR     = {Koloveas, Paris
               AND Chantzios, Thanasis
               AND Alevizopoulou, Sofia
               AND Skiadopoulos, Spiros
               AND Tryfonopoulos, Christos},
   TITLE      = {inTIME: A Machine Learning-Based Framework for Gathering and Leveraging Web Data to Cyber-Threat Intelligence},
   JOURNAL    = {Electronics},
   VOLUME     = {10},
   YEAR       = {2021},
   NUMBER     = {7},
   ARTICLE-NUM= {818},
   URL        = {https://www.mdpi.com/2079-9292/10/7/818},
   ISSN       = {2079-9292},
   DOI        = {10.3390/electronics10070818}
}

Folders and files

- ache-crawlers
- content-ranking
- mongo-docker/data
- named-entity-recognition
- rest-api/api-endpoints
- watchers
- .gitignore
- LICENSE
- README.md
- requirements.txt
- set_envs.sh
- start_service.sh
- unset_envs.sh

Repository: pkoloveas/inTIME
