Initial AI exploration for the backend of the applications #50

Draft
wants to merge 12 commits into
base: main
from
30,931 changes: 30,931 additions & 0 deletions notebook/combined_embeddings.ipynb

Large diffs are not rendered by default.

58,590 changes: 58,590 additions & 0 deletions notebook/course_embeddings.ipynb

Large diffs are not rendered by default.

908 changes: 908 additions & 0 deletions notebook/data/Course.csv

Large diffs are not rendered by default.

95 changes: 95 additions & 0 deletions notebook/data/Program.csv

Large diffs are not rendered by default.

693 changes: 693 additions & 0 deletions notebook/data/ProgramCourse.csv

Large diffs are not rendered by default.

11 changes: 11 additions & 0 deletions notebook/data/ProgramType.csv
@@ -0,0 +1,11 @@
"id","title"
697435,Maîtrise avec projet
738239,Microprogramme
697451,Maîtrise avec mémoire
915770,Concentration en technologies de la santé
697388,Doctorat
700093,Programme court
700092,Certificat
700095,DESS
700094,Cheminement universitaire en technologie (CUT)
699771,Baccalauréat
100,907 changes: 100,907 additions & 0 deletions notebook/data_exploration.ipynb

Large diffs are not rendered by default.

2,550 changes: 2,550 additions & 0 deletions notebook/data_science.ipynb

Large diffs are not rendered by default.

Binary file added notebook/db/db.png
2 changes: 2 additions & 0 deletions notebook/db/prisma-erd.svg
Binary file added notebook/embedding.rar
Binary file not shown.
39 changes: 39 additions & 0 deletions notebook/embedding_evaluation_results.csv
@@ -0,0 +1,39 @@
Model Name,Silhouette Score,Number of Clusters,Inertia,Time Taken,dataframe
MiniLM,0.06354355,5,2104.3017578125,62.3960816860199,course
Multilingual MiniLM-L12,0.06972561,5,1927.947265625,88.9396481513977,course
MiniLM-L6,0.06067439,5,6040.69970703125,48.64740014076233,course
Multilingual MPNet,0.055970665,5,1884.258056640625,254.99050879478455,course
DistilRoBERTa,0.049180128,5,13973.9873046875,159.84893226623535,course
MiniLM-L12,0.04269758,5,3714.1708984375,98.60889649391174,course
MiniLM-L3,0.038517494,5,2678.4150390625,23.103832721710205,course
XLM-RoBERTa Base Multilingual,0.0772015,5,49950.2265625,277.00257778167725,course
STSB XLM-R Multilingual,0.0772015,5,49950.2265625,330.6788206100464,course
MS MARCO DistilBERT,0.0336069,5,5619.89453125,149.96267318725586,course
MS MARCO BERT,0.044978958,5,16743.794921875,338.83691787719727,course
T5 Small,0.14505102,5,204.54937744140625,77.6448757648468,course
T5 Base,0.06801192,5,996.11279296875,396.1023998260498,course
T5 Large,0.041167583,5,1061.4305419921875,1004.1282534599304,course
FLAN-T5 Small,0.10541956,5,162.1136932373047,71.80396556854248,course
FLAN-T5 Base,0.06095163,5,388.6905517578125,278.6274485588074,course
FLAN-T5 Large,0.10817401,5,129.44412231445312,915.0140993595123,course
RoBERTa Large v1,0.051263746,5,224872.84375,903.1814665794373,course
MPNet Base v2,0.06346093,5,3160.8056640625,266.6551468372345,course
MiniLM,0.06354355,5,2104.3017578125,35.63298296928406,combined
Multilingual MiniLM-L12,0.06972561,5,1927.947265625,59.94369435310364,combined
MiniLM-L6,0.06067439,5,6040.69921875,35.490906953811646,combined
Multilingual MPNet,0.055970665,5,1884.258056640625,177.7440221309662,combined
DistilRoBERTa,0.049180128,5,13973.9873046875,117.59407949447632,combined
MiniLM-L12,0.04269758,5,3714.1708984375,68.83575224876404,combined
MiniLM-L3,0.038517494,5,2678.4150390625,18.43731689453125,combined
XLM-RoBERTa Base Multilingual,0.0772015,5,49950.2265625,177.5923249721527,combined
STSB XLM-R Multilingual,0.0772015,5,49950.2265625,177.61623072624207,combined
MS MARCO DistilBERT,0.0336069,5,5619.89453125,105.29467010498047,combined
MS MARCO BERT,0.044978958,5,16743.794921875,209.93503665924072,combined
T5 Small,0.14505102,5,204.54937744140625,59.6424446105957,combined
T5 Base,0.06801192,5,996.11279296875,227.7394471168518,combined
T5 Large,0.041167583,5,1061.4306640625,745.5525488853455,combined
FLAN-T5 Small,0.10541956,5,162.1136932373047,67.67559218406677,combined
FLAN-T5 Base,0.06095163,5,388.6905517578125,252.0490207672119,combined
FLAN-T5 Large,0.10817401,5,129.44412231445312,818.083892583847,combined
RoBERTa Large v1,0.051263746,5,224872.84375,748.9289219379425,combined
MPNet Base v2,0.06346093,5,3160.8056640625,219.38725876808167,combined
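
Reviewer note: the results file above is easier to compare once it is ranked. The following is a minimal sketch, not code from this PR, that parses a few rows copied verbatim from `notebook/embedding_evaluation_results.csv` and sorts them by silhouette score. The helper name `rank_by_silhouette` and the `SAMPLE` constant are hypothetical, introduced only for illustration.

```python
import csv
import io

# A few rows copied from notebook/embedding_evaluation_results.csv
# (only the "course" dataframe; the full file has 38 data rows).
SAMPLE = """Model Name,Silhouette Score,Number of Clusters,Inertia,Time Taken,dataframe
MiniLM,0.06354355,5,2104.3017578125,62.3960816860199,course
XLM-RoBERTa Base Multilingual,0.0772015,5,49950.2265625,277.00257778167725,course
T5 Small,0.14505102,5,204.54937744140625,77.6448757648468,course
FLAN-T5 Small,0.10541956,5,162.1136932373047,71.80396556854248,course
FLAN-T5 Large,0.10817401,5,129.44412231445312,915.0140993595123,course
"""


def rank_by_silhouette(csv_text, dataframe="course"):
    """Return (model, silhouette, time_taken) tuples, best silhouette first.

    Hypothetical helper: it only reads the columns present in the CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    selected = [
        (row["Model Name"], float(row["Silhouette Score"]), float(row["Time Taken"]))
        for row in reader
        if row["dataframe"] == dataframe
    ]
    # A higher silhouette score means better-separated clusters.
    return sorted(selected, key=lambda item: item[1], reverse=True)


ranking = rank_by_silhouette(SAMPLE)
for name, score, time_taken in ranking:
    print(f"{name:32s} silhouette={score:.4f} time_taken={time_taken:.1f}")
```

The sketch ranks on silhouette score rather than inertia because inertia is a sum of squared distances and therefore depends on each model's embedding scale and dimensionality, so it is not directly comparable across models. To run it against the real file, replace `io.StringIO(csv_text)` with an opened file handle.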
842 changes: 842 additions & 0 deletions notebook/gnn.ipynb

Large diffs are not rendered by default.

2,690 changes: 2,690 additions & 0 deletions notebook/knn.ipynb

Large diffs are not rendered by default.

8,835 changes: 8,835 additions & 0 deletions notebook/program_embeddings.ipynb

Large diffs are not rendered by default.
