diff --git a/.gitignore b/.gitignore
index d44bb80..eae84f9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,4 +1,3 @@
-assets
 metrics
 packages
 training
diff --git a/README.md b/README.md
index 305f043..d59fdae 100644
--- a/README.md
+++ b/README.md
@@ -1,31 +1,35 @@
 # greCy
 ## Ancient Greek models for spaCy
-This spaCy project trains seven ancient Greek models using the Perseus and Proiel [Universal Dependency corpora](https://universaldependencies.org). Trained and already compiled wheel packages are already available on the [Hugging Face Hub](https://huggingface.co/Jacobo). Prior to installation, the models can be tested on my [Ancient Greek Syntax Analyzer](https://huggingface.co/spaces/Jacobo/syntax). In general the project gives priority to the Proiel training dataset as it is the corpus that produces more accurate and efficient models.
+greCy is a set of ancient Greek spaCy models and their installer. The models were trained using the [Perseus](https://universaldependencies.org/treebanks/grc_perseus/index.html) and [Proiel UD](https://universaldependencies.org/treebanks/grc_proiel/index.html) corpora. Prior to installation, the models can be tested on my [Ancient Greek Syntax Analyzer](https://huggingface.co/spaces/Jacobo/syntax) on the [Hugging Face Hub](https://huggingface.co/), where you can also check the various performance metrics of each model.

-### Installation
+In general, models trained with the Proiel corpus perform better at POS tagging and dependency parsing, while Perseus models are better at punctuation-based sentence segmentation and at morphological analysis. Lemmatization is similar across models because they share the same neural lemmatizer in two variants: the more accurate variant was trained with word vectors, the other was not. The best models for lemmatization are the large (_lg) models.

-The models can be installed from the terminal with the commands below:
+### Installation

-**For the small model:**
+First install the Python package as usual:
+``` bash
+pip install -U grecy
 ```
-pip install https://huggingface.co/Jacobo/grc_proiel_sm/resolve/main/grc_proiel_sm-any-py3-none-any.whl
-```
-**For the medium:**
-```
-pip install https://huggingface.co/Jacobo/grc_proiel_md/resolve/main/grc_proiel_md-any-py3-none-any.whl
-```
-**For the large:**
-```
-pip install https://huggingface.co/Jacobo/grc_perseus_lg/resolve/main/grc_perseus_lg-any-py3-none-any.whl
-```
-**For the transformer based:**
+Once the package is successfully installed, you can proceed to install any of the following models:
+
+* grc_perseus_sm
+* grc_proiel_sm
+* grc_perseus_lg
+* grc_proiel_lg
+* grc_perseus_trf
+* grc_proiel_trf
+
+
+The models can be installed from the terminal with the commands below:
 ```
-pip install https://huggingface.co/Jacobo/grc_proiel_trf/resolve/main/grc_proiel_trf-any-py3-none-any.whl
+python -m grecy install MODEL
 ```
+where you replace MODEL with any of the model names listed above. The suffixes after the corpus name (_sm, _lg, and _trf) indicate the size of the model, which depends directly on the word embeddings used during training. The smallest models end in _sm (small) and are the least accurate: they are good for testing and for building lightweight apps. The _lg and _trf models are the large and transformer models, which are more accurate. The _lg models were trained using fastText word vectors in the spaCy floret version, and the _trf models were trained using a special version of BERT, pretrained by ourselves on the largest Ancient Greek corpus we could find (see more below).
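+
+For example, to install the small Proiel model and verify that it loads (a minimal check; any of the model names above can be substituted):
+``` bash
+python -m grecy install grc_proiel_sm
+python -c "import spacy; spacy.load('grc_proiel_sm')"
+```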
+If you would like to work with word similarity, choose the _lg models. The vectors for the large models were trained with the TLG corpus using [floret](https://github.com/explosion/floret), a fork of [fastText](https://fasttext.cc/).
+
 ### Loading
@@ -37,11 +41,30 @@
 nlp = spacy.load("grc_proiel_XX")
 ```
 Remember to replace _XX with the size of the model you would like to use, this means, _sm for small, _lg for large, and _trf for transformer. The _trf model is the most accurate but also the slowest.

-If you would like to work with word vectors, choose the large models. The vectors for the large models were trained with the TLG corpus using [floret](https://github.com/explosion/floret), a fork of [fastText](https://fasttext.cc/).
+### Use
+
+spaCy is a powerful NLP library with many applications. The most basic of its functions is the morpho-syntactic annotation of texts for further processing. A common routine is to load a model, create a doc object, and process a text:
+
+```
+import spacy
+nlp = spacy.load("grc_proiel_sm")
+
+text = "καὶ πρὶν μὲν ἐν κακοῖσι κειμένην ὅμως ἐλπίς μʼ ἀεὶ προσῆγε σωθέντος τέκνου ἀλκήν τινʼ εὑρεῖν κἀπικούρησιν δόμον"
+
+doc = nlp(text)
+
+for token in doc:
+    print(f'{token.text}, lemma: {token.lemma_} pos: {token.pos_}')
+
+```
+
+#### The apostrophe issue
+
+Unfortunately, there is no consensus among the different internet projects that offer ancient Greek texts on how to represent the Ancient Greek apostrophe. Modern Greek simply uses the regular apostrophe, but the ancient texts available in Perseus and Perseus under Philologic use various Unicode characters for the apostrophe. Instead of the apostrophe, we find the Greek koronis, the modifier letter apostrophe, and the right single quotation mark. Provisionally, I have opted for the modifier letter apostrophe in the corpus with which I trained the models. This means that if you want the greCy models to handle the apostrophe properly, you have to make sure that the Ancient Greek texts you are processing use the modifier letter apostrophe **ʼ** (U+02BC). Otherwise, the models will fail to lemmatize and tag some words in your texts that end with an 'apostrophe'.
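+
+If your texts use one of the other characters, you can normalize them before processing. The sketch below is one way to do it, assuming the variants named above sit at their usual codepoints (Greek koronis U+1FBD and right single quotation mark U+2019):
+
+```
+import spacy
+
+nlp = spacy.load("grc_proiel_sm")
+
+# Map the common apostrophe look-alikes to U+02BC MODIFIER LETTER APOSTROPHE,
+# the character used in the greCy training corpus.
+APOSTROPHES = {ord("\u1fbd"): "\u02bc", ord("\u2019"): "\u02bc"}
+
+def normalize_apostrophes(text: str) -> str:
+    return text.translate(APOSTROPHES)
+
+doc = nlp(normalize_apostrophes("ἐλπίς μ’ ἀεὶ προσῆγε"))
+```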

 ### Building

-The four standard spaCy models (small, medium, large, and transformer) are built and packaged using the following commands:
+I offer here the project file that I use to train the models, in case you want to customize them for your specific needs. The six standard spaCy models (small, large, and transformer, each trained on Proiel and Perseus) are built and packaged using the following commands:

 1. python -m spacy project assets

@@ -49,15 +72,32 @@ The four standard spaCy models (small, medium, large, and transformer) are built
 ### Performance

+For a general comparison, I share here the metrics of the transformer models grc_proiel_trf and grc_perseus_trf. For fine-tuning, these models use a transformer that was specifically trained to be used with spaCy, which makes them much smaller than the alternatives offered by Python NLP libraries such as Stanza and Trankit (for more information on the transformer model and how it was trained, see [aristoBERTo](https://huggingface.co/Jacobo/aristoBERTo)). greCy's _trf models outperform Stanza and Trankit in most metrics, and they have the advantage that their size is only ~430 MB vs. the 1.2 GB of the Trankit model trained with XLM Roberta. See the tables below:

-The Proiel_trf model uses for fine-tuning a transformer that was specifically trained to be used with spaCy and, consequently, makes the model much smaller than the alternatives offered by Python nlp libraries like Stanza and Trankit (for more information on the transformer model and how it was trained see [AristoBERTo](https://huggingface.co/Jacobo/aristoBERTo)). The spaCy _trf model outperforms Stanza and Trankit in most metrics and has the advantage that its size is only 662 MB vs. the 1.2 GB of the Trankit model trained with XLM Roberta. For a comparison, see table below:
+#### Proiel

 | Library | Tokens | Sentences | UPOS | XPOS | UFeats | Lemmas | UAS | LAS |
 | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| spaCy | 100 | 71.90 | 98.50 | 98.40 | 94.10 | 96.9 | 85.90 | 82.50 |
+| spaCy | 100 | 71.74 | 98.11 | 98.21 | 93.91 | 96.69 | 85.59 | 82.30 |
 | Trankit | 99.91 | 67.60 | 97.86 | 97.93 | 93.03 | 97.50 | 85.63 | 82.31 |
 | Stanza | 100 | 51.65 | 97.38 | 97.75 | 92.09 | 97.42 | 80.34 | 76.33 |

+#### Perseus
+
+| Library | Tokens | Sentences | UPOS | XPOS | UFeats | Lemmas | UAS | LAS |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| spaCy | 100 | 99.38 | 95.83 | 95.92 | 94.79 | 97.23 | 80.93 | 75.74 |
+| Trankit | 99.71 | 98.70 | 93.97 | 87.25 | 91.66 | 88.52 | 83.48 | 78.56 |
+| Stanza | 99.8 | 98.85 | 92.54 | 85.22 | 91.06 | 88.26 | 78.75 | 73.35 |
+
+### Caveat
+
+Metrics, however, can be misleading. This becomes particularly obvious when you work with texts that are not part of the training and evaluation datasets. In addition, greCy's lemmatizers (in all sizes) exhibit lower benchmarks than the above-mentioned NLP libraries, but they have a substantially larger vocabulary than the Stanza and Trankit models because they were trained with a complementary lemma corpus derived from Giuseppe G. A. Celano's [lemmatized corpus](https://github.com/gcelano/LemmatizedAncientGreekXML). This means that greCy's lemmatizers perform better than Trankit and Stanza when processing texts not included in the Perseus and Proiel datasets.
+
+### Future Developments
+
+This project was initiated as part of the [Diogenet Project](https://diogenet.ucsd.edu/), a research initiative that focuses on the automatic extraction of social relations from Ancient Greek texts. As part of this project, greCy will first add, in the near future, a NER pipeline for the identification of entities; later, I also hope to offer a pipeline for the extraction of social relations from Greek texts. This pipeline should contribute to the study of social networks in the ancient world.

-Metrics, however, can be misleading. This becomes particularly obvious when you work with texts that are not part of the training dataset. In addition, greCy's lemmatizers (in all sizes) exhibit lower benchmarks but have a substantially larger vocabulary than the Stanza and Trankit models because they were trained with a complemental lemma corpus derived from Giussepe G.A. Celano [lemmatized corpus](https://github.com/gcelano/LemmatizedAncientGreekXML). This means that the greCy's lemmatizers perform better than Trankit and Stanza when processing texts not included in the Perseus and Proiel datasets.
diff --git a/assets/UD_Ancient_Greek-PROIEL b/assets/UD_Ancient_Greek-PROIEL new file mode 160000 index 0000000..8615444 --- /dev/null +++ b/assets/UD_Ancient_Greek-PROIEL @@ -0,0 +1 @@ +Subproject commit 86154446b88b085ed9ad23e5907170198c0395b2 diff --git a/assets/UD_Ancient_Greek-Perseus b/assets/UD_Ancient_Greek-Perseus new file mode 160000 index 0000000..ef86628 --- /dev/null +++ b/assets/UD_Ancient_Greek-Perseus @@ -0,0 +1 @@ +Subproject commit ef866281b1af25a5a6c06959eee3aa05467afba8 diff --git a/configs/large.cfg b/configs/large.cfg index 919d3e1..0a570c4 100644 --- a/configs/large.cfg +++ b/configs/large.cfg @@ -11,7 +11,7 @@ seed = 0 [nlp] lang = "grc" -pipeline = ["tok2vec","morphologizer","tagger","parser","senter","lemmatizer","attribute_ruler"] +pipeline = ["tok2vec","morphologizer","tagger","parser","lemmatizer","attribute_ruler"] batch_size = 128 disabled = [] before_creation = null @@ -25,7 +25,6 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"} source = "./training/lemmatizer/large/model-best" replace_listeners = ["model.tok2vec"] - [components.attribute_ruler] factory = "attribute_ruler" scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"} @@ -68,9 +67,6 @@ nO = null width = ${components.tok2vec.model.encode.width} upstream = "tok2vec" -[components.senter] -source = "./training/senter/large/model-best" - [components.tagger] factory = "tagger" overwrite = false @@ -143,7 +139,7 @@ patience = 5000 max_epochs = 0 max_steps = 20000 eval_frequency = 200 -frozen_components = ["lemmatizer","senter"] +frozen_components = ["lemmatizer"] annotating_components = [] before_to_disk = null @@ -161,18 +157,18 @@ compound = 1.001 t = 0.0 -# [training.logger] -# @loggers = "spacy.WandbLogger.v3" -# project_name = "proiel" -# remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] -# log_dataset_dir = "./corpus" -# model_log_interval = 1000 -# entity = null -# run_name = null - [training.logger] -@loggers = "spacy.ConsoleLogger.v1" -progress_bar = false +@loggers = "spacy.WandbLogger.v3" +project_name = "greCy" +remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] +log_dataset_dir = "./corpus" +model_log_interval = 1000 +entity = null +run_name = null + +# [training.logger] +# @loggers = "spacy.ConsoleLogger.v1" +# progress_bar = false [training.optimizer] @optimizers = "Adam.v1" diff --git a/configs/lemmatizer_sm.cfg b/configs/lemmatizer_sm.cfg index eaf1469..7db0ee3 100644 --- a/configs/lemmatizer_sm.cfg +++ b/configs/lemmatizer_sm.cfg @@ -11,7 +11,7 @@ seed = 0 [nlp] lang = "grc" pipeline = ["lemmatizer"] -batch_size = 32 +batch_size = 64 disabled = [] before_creation = null after_creation = null @@ -109,18 +109,18 @@ stop = 1000 compound = 1.001 t = 0.0 -[training.logger] -@loggers = "spacy.ConsoleLogger.v1" -progress_bar = false - # [training.logger] -# @loggers = "spacy.WandbLogger.v3" -# project_name = "lemmatizer" -# remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] -# log_dataset_dir = "./corpus" -# model_log_interval = 1000 -# entity = null -# run_name = null +# @loggers = "spacy.ConsoleLogger.v1" +# progress_bar = false + +[training.logger] +@loggers = "spacy.WandbLogger.v3" +project_name = "lemmatizer" +remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] +log_dataset_dir = "./corpus" +model_log_interval = 1000 +entity = null +run_name = null [training.optimizer] @optimizers = "Adam.v1" diff --git 
a/configs/lemmatizer_trf.cfg b/configs/lemmatizer_trf.cfg index a7ec56e..79071b9 100644 --- a/configs/lemmatizer_trf.cfg +++ b/configs/lemmatizer_trf.cfg @@ -102,18 +102,18 @@ size = 2000 buffer = 256 get_length = null -[training.logger] -@loggers = "spacy.ConsoleLogger.v1" -progress_bar = false - # [training.logger] -# @loggers = "spacy.WandbLogger.v3" -# project_name = "lemmatizer" -# remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] -# log_dataset_dir = "./corpus" -# model_log_interval = 1000 -# entity = null -# run_name = null +# @loggers = "spacy.ConsoleLogger.v1" +# progress_bar = false + +[training.logger] +@loggers = "spacy.WandbLogger.v3" +project_name = "lemmatizer" +remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] +log_dataset_dir = "./corpus" +model_log_interval = 1000 +entity = null +run_name = null [training.optimizer] @optimizers = "Adam.v1" @@ -146,4 +146,4 @@ after_init = null [initialize.components] -[initialize.tokenizer] \ No newline at end of file +[initialize.tokenizer] diff --git a/configs/lemmatizer_vec.cfg b/configs/lemmatizer_vec.cfg index c1d9b75..fd55c59 100644 --- a/configs/lemmatizer_vec.cfg +++ b/configs/lemmatizer_vec.cfg @@ -11,7 +11,7 @@ seed = 0 [nlp] lang = "grc" pipeline = ["lemmatizer"] -batch_size = 32 +batch_size = 64 disabled = [] before_creation = null after_creation = null @@ -109,17 +109,17 @@ stop = 1000 compound = 1.001 t = 0.0 -[training.logger] -@loggers = "spacy.ConsoleLogger.v1" -progress_bar = false - # [training.logger] -# @loggers = "spacy.WandbLogger.v3" -# project_name = "lemmatizer" -# remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] -# log_dataset_dir = "./corpus" -# model_log_interval = 1000 -# entity = null +# @loggers = "spacy.ConsoleLogger.v1" +# progress_bar = false + +[training.logger] +@loggers = "spacy.WandbLogger.v3" +project_name = "lemmatizer" +remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] +log_dataset_dir = "./corpus" +model_log_interval = 1000 +entity = null run_name = null [training.optimizer] diff --git a/configs/small.cfg b/configs/small.cfg index 4796bde..6ebd3d9 100644 --- a/configs/small.cfg +++ b/configs/small.cfg @@ -10,7 +10,7 @@ seed = 0 [nlp] lang = "grc" -pipeline = ["tok2vec","morphologizer","tagger","parser","senter","lemmatizer","attribute_ruler"] +pipeline = ["tok2vec","morphologizer","tagger","parser","lemmatizer","attribute_ruler"] batch_size = 128 disabled = [] before_creation = null @@ -23,9 +23,6 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"} [components.lemmatizer] source = "./training/lemmatizer/small/model-best" -[components.senter] -source = "./training/senter/small/model-best" - [components.attribute_ruler] factory = "attribute_ruler" scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"} @@ -134,7 +131,7 @@ patience = 5000 max_epochs = 0 max_steps = 20000 eval_frequency = 200 -frozen_components = ["lemmatizer","senter"] +frozen_components = ["lemmatizer"] annotating_components = [] before_to_disk = null @@ -151,18 +148,18 @@ stop = 1000 compound = 1.001 t = 0.0 -[training.logger] - @loggers = "spacy.ConsoleLogger.v1" - progress_bar = false - # [training.logger] -# @loggers = "spacy.WandbLogger.v3" -# project_name = "proiel" -# remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] -# log_dataset_dir = "./corpus" -# model_log_interval = 1000 -# entity = null -# run_name = null +# 
@loggers = "spacy.ConsoleLogger.v1" +# progress_bar = false + +[training.logger] +@loggers = "spacy.WandbLogger.v3" +project_name = "greCy" +remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] +log_dataset_dir = "./corpus" +model_log_interval = 1000 +entity = null +run_name = null [training.optimizer] @optimizers = "Adam.v1" diff --git a/configs/transformer_perseus.cfg b/configs/transformer_perseus.cfg new file mode 100644 index 0000000..8de2d2f --- /dev/null +++ b/configs/transformer_perseus.cfg @@ -0,0 +1,246 @@ +[paths] +train = null +dev = null +vectors = null +init_tok2vec = null + +[system] +gpu_allocator = "pytorch" +seed = 0 + +[nlp] +lang = "grc" +pipeline = ["transformer","morphologizer","tagger","senter","parser","lemmatizer","attribute_ruler"] +batch_size = 128 +disabled = [] +before_creation = null +after_creation = null +after_pipeline_creation = null +tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"} + +[components] + +[components.lemmatizer] +source = "./training/lemmatizer/small/model-best" + +[components.senter] +factory = "senter" +overwrite = false +scorer = {"@scorers":"spacy.senter_scorer.v1"} + +[components.senter.model] +@architectures = "spacy.Tagger.v2" +nO = null +normalize = false + +[components.senter.model.tok2vec] +@architectures = "spacy.HashEmbedCNN.v2" +pretrained_vectors = true +width = 12 +depth = 1 +embed_size = 2000 +window_size = 1 +maxout_pieces = 2 +subword_features = true + +[components.morphologizer] +factory = "morphologizer" +extend = false +overwrite = true +scorer = {"@scorers":"spacy.morphologizer_scorer.v1"} + +[components.morphologizer.model] +@architectures = "spacy.Tagger.v2" +nO = null +normalize = false + +[components.morphologizer.model.tok2vec] +@architectures = "spacy-transformers.TransformerListener.v1" +grad_factor = 1.0 +pooling = {"@layers":"reduce_mean.v1"} +upstream = "*" + +[components.parser] +factory = "parser" +learn_tokens = false +min_action_freq = 30 +moves = null +scorer = {"@scorers":"spacy.parser_scorer.v1"} +update_with_oracle_cut_size = 100 + +[components.parser.model] +@architectures = "spacy.TransitionBasedParser.v2" +state_type = "parser" +extra_state_tokens = false +hidden_width = 128 +maxout_pieces = 3 +use_upper = false +nO = null + +[components.parser.model.tok2vec] +@architectures = "spacy-transformers.TransformerListener.v1" +grad_factor = 1.0 +pooling = {"@layers":"reduce_mean.v1"} +upstream = "*" + +[components.tagger] +factory = "tagger" +neg_prefix = "!" 
+overwrite = false +scorer = {"@scorers":"spacy.tagger_scorer.v1"} + +[components.tagger.model] +@architectures = "spacy.Tagger.v2" +nO = null +normalize = false + +[components.tagger.model.tok2vec] +@architectures = "spacy-transformers.TransformerListener.v1" +grad_factor = 1.0 +pooling = {"@layers":"reduce_mean.v1"} +upstream = "*" + +[components.attribute_ruler] +factory = "attribute_ruler" +scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"} +validate = false + +[components.transformer] +factory = "transformer" +max_batch_items = 4096 +set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"} + +[components.transformer.model] +@architectures = "spacy-transformers.TransformerModel.v3" +name = "Jacobo/aristoBERTo" +mixed_precision = false + +[components.transformer.model.get_spans] +@span_getters = "spacy-transformers.strided_spans.v1" +window = 128 +stride = 96 + +[components.transformer.model.grad_scaler_config] + +[components.transformer.model.tokenizer_config] +use_fast = true + +[components.transformer.model.transformer_config] + +[corpora] + +[corpora.dev] +@readers = "spacy.Corpus.v1" +path = ${paths.dev} +max_length = 0 +gold_preproc = false +limit = 0 +augmenter = null + +[corpora.train] +@readers = "spacy.Corpus.v1" +path = ${paths.train} +max_length = 0 +gold_preproc = false +limit = 0 +augmenter = null + +[training] +accumulate_gradient = 3 +dev_corpus = "corpora.dev" +train_corpus = "corpora.train" +seed = ${system.seed} +gpu_allocator = ${system.gpu_allocator} +dropout = 0.1 +patience = 5000 +max_epochs = 0 +max_steps = 20000 +eval_frequency = 200 +frozen_components = ["lemmatizer"] +annotating_components = ["lemmatizer"] +before_to_disk = null +before_update = null + +[training.batcher] +@batchers = "spacy.batch_by_padded.v1" +discard_oversize = true +size = 2000 +buffer = 256 +get_length = null + + +[training.logger] + @loggers = "spacy.WandbLogger.v3" + project_name = "greCy" + remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] + log_dataset_dir = "./corpus" + model_log_interval = 1000 + entity = null + run_name = null + +# [training.logger] +# @loggers = "spacy.ConsoleLogger.v1" +# progress_bar = false + +[training.optimizer] +@optimizers = "Adam.v1" +beta1 = 0.9 +beta2 = 0.999 +L2_is_weight_decay = true +L2 = 0.01 +grad_clip = 1.0 +use_averages = false +eps = 0.00000001 + +[training.optimizer.learn_rate] +@schedules = "warmup_linear.v1" +warmup_steps = 250 +total_steps = 20000 +initial_rate = 0.00005 + +[training.score_weights] +pos_acc = 0.12 +morph_acc = 0.12 +morph_per_feat = null +tag_acc = 0.25 +sents_f = 0.0 +sents_p = null +sents_r = null +dep_uas = 0.12 +dep_las = 0.12 +dep_las_per_type = null +lemma_acc = 1.0 + +[pretraining] + +[initialize] +vectors = ${paths.vectors} +init_tok2vec = ${paths.init_tok2vec} +vocab_data = null +lookups = null +before_init = null +after_init = null + +[initialize.components] + +[initialize.components.parser] + +[initialize.components.parser.labels] +@readers = "spacy.read_labels.v1" +path = "corpus/labels/parser.json" +require = false + +[initialize.components.tagger] + +[initialize.components.tagger.labels] +@readers = "spacy.read_labels.v1" +path = "corpus/labels/tagger.json" +require = false + +[initialize.components.attribute_ruler] + +[initialize.components.attribute_ruler.patterns] +@readers = "srsly.read_json.v1" +path = "data/augments/attribute_ruler_patterns.json" + +[initialize.tokenizer] \ No newline at end of file diff --git 
a/configs/transformer.cfg b/configs/transformer_proiel.cfg similarity index 82% rename from configs/transformer.cfg rename to configs/transformer_proiel.cfg index a04ea21..312b88f 100644 --- a/configs/transformer.cfg +++ b/configs/transformer_proiel.cfg @@ -21,15 +21,27 @@ tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"} [components] [components.lemmatizer] -source = "training/lemmatizer/large/model-best" +source = "./training/lemmatizer/small/model-best" [components.senter] -source = "training/senter/large/model-best" +factory = "senter" +overwrite = false +scorer = {"@scorers":"spacy.senter_scorer.v1"} -[components.attribute_ruler] -factory = "attribute_ruler" -scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"} -validate = false +[components.senter.model] +@architectures = "spacy.Tagger.v2" +nO = null +normalize = false + +[components.senter.model.tok2vec] +@architectures = "spacy.HashEmbedCNN.v2" +pretrained_vectors = true +width = 12 +depth = 1 +embed_size = 2000 +window_size = 1 +maxout_pieces = 2 +subword_features = true [components.morphologizer] factory = "morphologizer" @@ -59,8 +71,8 @@ update_with_oracle_cut_size = 100 @architectures = "spacy.TransitionBasedParser.v2" state_type = "parser" extra_state_tokens = false -hidden_width = 64 -maxout_pieces = 2 +hidden_width = 128 +maxout_pieces = 3 use_upper = false nO = null @@ -86,6 +98,11 @@ grad_factor = 1.0 pooling = {"@layers":"reduce_mean.v1"} upstream = "transformer" +[components.attribute_ruler] +factory = "attribute_ruler" +scorer = {"@scorers":"spacy.attribute_ruler_scorer.v1"} +validate = false + [components.transformer] factory = "transformer" max_batch_items = 4096 @@ -137,9 +154,10 @@ patience = 5000 max_epochs = 0 max_steps = 20000 eval_frequency = 200 -frozen_components = ["lemmatizer","senter"] -annotating_components = [] +frozen_components = ["lemmatizer"] +annotating_components = ["lemmatizer"] before_to_disk = null +before_update = null [training.batcher] @batchers = "spacy.batch_by_padded.v1" @@ -148,9 +166,19 @@ size = 2000 buffer = 256 get_length = null + [training.logger] -@loggers = "spacy.ConsoleLogger.v1" -progress_bar = false + @loggers = "spacy.WandbLogger.v3" + project_name = "greCy" + remove_config_values = ["paths.train","paths.dev","corpora.train.path","corpora.dev.path"] + log_dataset_dir = "./corpus" + model_log_interval = 1000 + entity = null + run_name = null + +# [training.logger] +# @loggers = "spacy.ConsoleLogger.v1" +# progress_bar = false # [training.logger] # @loggers = "spacy.WandbLogger.v3" @@ -168,7 +196,7 @@ beta2 = 0.999 L2_is_weight_decay = true L2 = 0.01 grad_clip = 1.0 -use_averages = true +use_averages = false eps = 0.00000001 [training.optimizer.learn_rate] @@ -178,16 +206,16 @@ total_steps = 20000 initial_rate = 0.00005 [training.score_weights] -pos_acc = 0.25 -morph_acc = 0.25 +pos_acc = 0.12 +morph_acc = 0.12 morph_per_feat = null -tag_acc = 0.08 -dep_uas = 0.0 -dep_las = 0.08 -dep_las_per_type = null +tag_acc = 0.25 +sents_f = 0.0 sents_p = null sents_r = null -sents_f = null +dep_uas = 0.12 +dep_las = 0.12 +dep_las_per_type = null lemma_acc = 1.0 [pretraining] @@ -196,7 +224,7 @@ lemma_acc = 1.0 vectors = ${paths.vectors} init_tok2vec = ${paths.init_tok2vec} vocab_data = null -lookups = null +lookups = null before_init = null after_init = null diff --git a/corpus/dev/grc_perseus-ud-dev.spacy b/corpus/dev/grc_perseus-ud-dev.spacy new file mode 100644 index 0000000..7e71bd4 Binary files /dev/null and b/corpus/dev/grc_perseus-ud-dev.spacy differ diff --git 
a/corpus/dev/grc_proiel-ud-dev.spacy b/corpus/dev/grc_proiel-ud-dev.spacy new file mode 100644 index 0000000..6b6b6a0 Binary files /dev/null and b/corpus/dev/grc_proiel-ud-dev.spacy differ diff --git a/corpus/dev/lemma_dev.spacy b/corpus/dev/lemma_dev.spacy index 157920d..aa660f7 100644 Binary files a/corpus/dev/lemma_dev.spacy and b/corpus/dev/lemma_dev.spacy differ diff --git a/corpus/train/grc_perseus-ud-train.spacy b/corpus/train/grc_perseus-ud-train.spacy new file mode 100644 index 0000000..8f77653 Binary files /dev/null and b/corpus/train/grc_perseus-ud-train.spacy differ diff --git a/corpus/train/grc_proiel-ud-train.spacy b/corpus/train/grc_proiel-ud-train.spacy new file mode 100644 index 0000000..358e7c5 Binary files /dev/null and b/corpus/train/grc_proiel-ud-train.spacy differ diff --git a/corpus/train/lemma_train.spacy b/corpus/train/lemma_train.spacy index 54c28a5..bf507a9 100644 Binary files a/corpus/train/lemma_train.spacy and b/corpus/train/lemma_train.spacy differ diff --git a/project.yml b/project.yml index aacb698..f4a8a34 100644 --- a/project.yml +++ b/project.yml @@ -1,17 +1,18 @@ title: "Ancient Greek PROIEL and PERSEUS MODELS" -description: "This project trains four spaCy models using the ancient Greek Universal Dependency treebanks Proiel and Perseus." +description: "This project trains seven spaCy models using the ancient Greek Universal Dependency treebanks Proiel and Perseus." spacy_version: ">=3.5,<4.0.0" email: "jmyerston@ucsd.edu" source: "https://universaldependencies.org/" vars: - license: "cc-by-sa-3.0" + license: "MIT" author: "Jacobo Myerston" config_ssm: "senter" config_slg: "senter_vec" config_sm: "small" config_md: "medium" config_lg: "large" - config_trf: "transformer" + config_trf_proiel: "transformer_proiel" + config_trf_perseus: "transformer_perseus" config_lsm: "lemmatizer_sm" config_llg: "lemmatizer_vec" config_ltrf: "lemmatizer_trf" @@ -29,7 +30,7 @@ vars: package_name_perseus: "perseus" package_name_grecy: "greCy_lemmatizer" - package_version: "3.5.2" + package_version: "3.5.3" gpu: 0 # These are the directories that the project needs. 
The project CLI will make @@ -56,8 +57,6 @@ assets: workflows: all: - preprocess - - train-senter-small - - train-senter-large - train-lemmatizer-small - train-lemmatizer-large # - train-lemmatizer-greCy @@ -111,30 +110,6 @@ commands: - "corpus/test/grc_proiel-ud-test.spacy" - "vectors/${vars.vectors_lg}" - - name: train-senter-small - help: "Train senter component without vectors" - script: - - "mkdir -p training/senter" - - "python -m spacy train configs/senter.cfg --output training/senter/small --gpu-id ${vars.gpu} --paths.train corpus/train/grc_perseus-ud-train.spacy --paths.dev corpus/dev/grc_perseus-ud-dev.spacy --nlp.lang=${vars.lang}" - deps: - - "corpus/train/grc_perseus-ud-train.spacy" - - "corpus/dev/grc_perseus-ud-dev.spacy" - - "configs/${vars.config_ssm}.cfg" - outputs: - - "training/senter/small" - - - name: train-senter-large - help: "Train senter component without vectors" - script: - - "mkdir -p training/senter" - - "python -m spacy train configs/${vars.config_slg}.cfg --output training/senter/large --gpu-id ${vars.gpu} --paths.train corpus/train/grc_perseus-ud-train.spacy --paths.dev corpus/dev/grc_perseus-ud-dev.spacy --paths.vectors vectors/large --nlp.lang=${vars.lang}" - deps: - - "corpus/train/grc_perseus-ud-train.spacy" - - "corpus/dev/grc_perseus-ud-dev.spacy" - - "configs/${vars.config_slg}.cfg" - outputs: - - "training/senter/large" - - name: train-lemmatizer-small help: "Train the lemmatizer for the small model" script: @@ -289,11 +264,11 @@ commands: help: "Train ${vars.treebank_proiel}" script: - "mkdir -p training/transformer/proiel/assembled" - - "python -m spacy train configs/${vars.config_trf}.cfg --output training/transformer/proiel/assembled --gpu-id ${vars.gpu} --paths.train corpus/train/grc_proiel-ud-train.spacy --paths.dev corpus/dev/grc_proiel-ud-dev.spacy --nlp.lang=${vars.lang}" + - "python -m spacy train configs/${vars.config_trf_proiel}.cfg --output training/transformer/proiel/assembled --gpu-id ${vars.gpu} --paths.train corpus/train/grc_proiel-ud-train.spacy --paths.dev corpus/dev/grc_proiel-ud-dev.spacy --nlp.lang=${vars.lang}" deps: - "corpus/train/grc_proiel-ud-train.spacy" - "corpus/dev/grc_proiel-ud-dev.spacy" - - "configs/${vars.config_trf}.cfg" + - "configs/${vars.config_trf_proiel}.cfg" outputs: - "training/transformer/proiel/assembled" @@ -320,11 +295,11 @@ commands: help: "Train ${vars.treebank_perseus}" script: - "mkdir -p training/transformer/perseus/assembled" - - "python -m spacy train configs/${vars.config_trf}.cfg --output training/transformer/perseus/assembled --gpu-id ${vars.gpu} --paths.train corpus/train/grc_perseus-ud-train.spacy --paths.dev corpus/dev/grc_perseus-ud-dev.spacy --nlp.lang=${vars.lang}" + - "python -m spacy train configs/${vars.config_trf_perseus}.cfg --output training/transformer/perseus/assembled --gpu-id ${vars.gpu} --paths.train corpus/train/grc_perseus-ud-train.spacy --paths.dev corpus/dev/grc_perseus-ud-dev.spacy --nlp.lang=${vars.lang}" deps: - "corpus/train/grc_perseus-ud-train.spacy" - "corpus/dev/grc_perseus-ud-dev.spacy" - - "configs/${vars.config_trf}.cfg" + - "configs/${vars.config_trf_perseus}.cfg" outputs: - "training/transformer/perseus/assembled" @@ -351,7 +326,7 @@ commands: help: "Train the lemmatizer for the medium model" script: - "mkdir -p training/lemmatizer/greCy_lemmatizer" - - "python -m spacy train configs/${vars.config_ltrf}.cfg --output ./training/lemmatizer/greCy_lemmatizer --gpu-id ${vars.gpu} --paths.train corpus/train --paths.dev corpus/dev --nlp.lang=${vars.lang}" + - "python 
-m spacy train configs/${vars.config_ltrf}.cfg --output ./training/lemmatizer/greCy_lemmatizer --gpu-id ${vars.gpu} --paths.train corpus/train --paths.dev corpus/dev --nlp.lang=${vars.lang}" deps: - "corpus/train/grc_perseus-ud-train.spacy" - "corpus/train/grc_perseus-ud-train.spacy" diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000..e33adc3 --- /dev/null +++ b/requirements.txt @@ -0,0 +1,3 @@ +spacy>=3.5.3 +wandb +spacy-huggingface-hub \ No newline at end of file