Commit 3f4ee27

add finetune nb example

king-menin committed Mar 9, 2021
1 parent 54c0fe4 commit 3f4ee27
Showing 1 changed file with 250 additions and 0 deletions: examples/ruGPT3XL_finetune_example.ipynb
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"\n",
"%autoreload 2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Run finetuning"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Prepare your files and split them into shards. The files will be distributed across all GPUs. Store the paths to all text files in `files.list`\n",
"\n",
"Example of `files.list` for single gpu:\n",
"\n",
"```text\n",
"data/train.txt\n",
"```\n",
"\n",
"* Set your path in the template script in our repo [scripts/deepspeed_gpt3_xl_finetune.sh](https://github.com/sberbank-ai/ru-gpts/blob/master/scripts/deepspeed_gpt3_xl_finetune.sh)\n",
"\n",
"* `cd scripts` to go to the scripts directory\n",
"\n",
"* run the command:\n",
"\n",
"```bash\n",
"sh deepspeed_gpt3_xl_finetune.sh\n",
"```"
]
},
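{
"cell_type": "markdown",
"metadata": {},
"source": [
"With multiple GPUs, `files.list` would contain one shard path per line, so the shards can be distributed across the GPUs. A sketch for two GPUs (the shard paths are illustrative, not from the repo):\n",
"\n",
"```text\n",
"data/train_shard_0.txt\n",
"data/train_shard_1.txt\n",
"```"
]
},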
{
"cell_type": "markdown",
"metadata": {
"id": "7TbybJfIpBVa"
},
"source": [
"### Test model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Load model"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "EnTy1SEajpPV"
},
"outputs": [],
"source": [
"import warnings\n",
"warnings.filterwarnings(\"ignore\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "Z-DSEz0ljpPV"
},
"outputs": [],
"source": [
"import sys\n",
"sys.path.append(\"../../ru-gpts/\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "_9GABoxNVpH4"
},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"USE_DEEPSPEED\"] = \"1\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_HeCFnJEjpPV"
},
"outputs": [],
"source": [
"from src.xl_wrapper import RuGPT3XL"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OdyughHDjpPV"
},
"source": [
"Note: `seq_len` is the maximum sequence length (in tokens) used during generation. The maximum available `seq_len` is 2048.\n",
"Inference takes around 10 GB of GPU memory."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "56aNJNPYjpPW",
"outputId": "aa154214-82b6-4604-d0b7-8ad7629aa704",
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> initializing model parallel with size 1\n",
"> initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 3952 and data parallel seed: 1234\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/user/conda/lib/python3.7/site-packages/ipykernel/ipkernel.py:283: DeprecationWarning: `should_run_async` will not call `transform_cell` automatically in the future. Please pass the result to `transformed_cell` argument and any exception that happen during thetransform in `preprocessing_exc_tuple` in IPython 7.17 and above.\n",
" and should_run_async(code)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Use alternating sparse & dense attention layers\n"
]
}
],
"source": [
"gpt = RuGPT3XL.from_pretrained(\n",
" \"sberbank-ai/rugpt3xl\",\n",
" weights_path=\"/path/2/save/model/train-iters/mp_rank_00_model_states.pt\",\n",
" deepspeed_config_path=\"../src/deepspeed_config/gpt3_xl_sparse_2048.json\",\n",
" seq_len=512\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Example of generation"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "7Qw65CVzjpPZ",
"outputId": "79a31fb8-656d-4923-f208-1589d675d7c7"
},
"outputs": [],
"source": [
"def generate(text, additional_len=20):\n",
"    # Generate a continuation of `text` with beam search,\n",
"    # then strip everything after the end-of-text token\n",
"    return filter_resuls(gpt.generate(\n",
"        text=text,\n",
"        max_length=len(gpt.tokenizer.encode(text)) + additional_len,\n",
"        num_beams=10,\n",
"        eos_token_id=gpt.tokenizer.eos_token_id,\n",
"        num_return_sequences=1,\n",
"    ))[0]"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"def filter_resuls(nr):\n",
"    # Cut each generated string at the <|endoftext|> token\n",
"    return [x[:x.find(\"<|endoftext|>\")] for x in nr]"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Фрегат «Адмирал Макаров» вошел в Средиземное море\\u2028Большой противолодочный корабль «Вице-адмирал Кулак'"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"generate(\"Фрегат «Адмирал Макаров»\")"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"name": "ruGPT3XL_generation",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.8"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
