Song lyrics search engine for Sinhalese language using Elasticsearch and python.
The corpus contains Sinhala songs with the following parameters. The corpus is created by using the https://sinhalasongbook.com/ website.
- 'Artist' - the artist
- 'Lyrics'- the lyrics artist
- 'Key'- guitar key
- 'Genre' - the genre of the song
- 'Music' - the musician of the song
- 'Song'- lyrics content
- 'Title' - the title of the song
- 'title-en' - title in English
- 'Visits' - the number of visits to the song
├── Corpus: original data, the processed data and the python script to put to Elasticsearch
├── Flask_app: Backend of the search engine
├── lyrics: Python scripts used to prepare the corpus
- Python, Flask, requests library and Elasticsearch needed in your PC.
- Clone the repository.
- Run an Elasticsearch instance on port 9200.
- Go to the folder Corpus. Run the python script send.py to put the corpus to the Elasticsearch.
- Then go to the Flask_app folder and open a terminal and run the following commands.
source venv/bin/activate
export FLASK_APP=controller
flask run
- Now you can create requests to below-mentioned endpoints.
Endpoint | Request Type | Functionality | Example |
---|---|---|---|
/searchBy | GET | Ruled based classification explained in the diagram | /searchBy term=වික්ටර් රත්නායක ගෙ ක්ලැසික් නුඹේ නමින් මා දුක්වී ගොතනා කවි වැල් නුඹට 2 |
/searchBy/param | GET | Search by specific filed with a term | /seachBy/param searchby=Title&term=බඹරෙකු හැඬුවා |
/searchBy/visits | GET | Range search by visits | searchBy/visits?visists=500,1000 |
/multisearchBy/param | GET | Multi-search with set of fields with a term | multiSearchBy/param?searchby=Song,Title,Artist&term=නුඹ වගෙයි |
/facetedSearch | POST | After selecting filters from received facets request to this with the term and selected filters | /facetedSearch { “term”: “බඹරෙකු හැඬුවා“, “filter” : [ {“Lyrics” : "උපාලි ධනවලවිථන"}, {“Artist”: “මිල්ටන් මල්ලවරාච්චි”}]} |
/advancedSeach | POST | Search with given set of fields including range of visit | {"filter" : [ {"keyword" : "Artist", "value": "ක්ලැරන්ස් විජේවර්ධන" }, {"keyword" : "Genre", "value": "පැරණි පොප්ස්" }, {"keyword" : "Music", "value": "ක්ලැරන්ස් විජේවර්ධන" }, {"keyword" : "Lyrics", "value": "ක්ලැරන්ස් විජේවර්ධන" } ], "range": {"gte" : 100, "lte": 1000}} |
Search text should be in Sinhala
- artist_key_words = ['ගෙ', 'ගැයූ','ගයන', 'කියන', 'කියූ', 'ගායනා']
- lyrics_key_words = ['ලියූ', 'ලියන', 'ලිව්ව', 'රචනා', 'රචිත', 'ලීව']
- music_key_words = ['සංගීත']
- genre_key_words = ['පොප්', 'ක්ලැසික්', 'දේවානුභාවයෙන්','පොප්ස්','ඩුවට්ස්','ක්ලැසික්','යුගල','ඕල්ඩීස්','කැලිප්සෝ','ළමා']