From 629450cecde6383303a1faa1274bde03dac5d0c4 Mon Sep 17 00:00:00 2001 From: Hamel Husain Date: Thu, 21 Mar 2024 22:28:36 -0700 Subject: [PATCH] Bootstrap Hosted Axolotl Docs w/Quarto (#1429) * precommit * mv styes.css * fix links --- .github/workflows/docs.yml | 28 +++++++++++ .gitignore | 3 ++ README.md | 14 +++--- _quarto.yml | 51 +++++++++++++++++++++ devtools/README.md | 2 +- docs/.gitignore | 2 + docs/config.qmd | 17 +++++++ docs/{debugging.md => debugging.qmd} | 6 ++- docs/faq.md | 18 -------- docs/faq.qmd | 21 +++++++++ docs/{fsdp_qlora.md => fsdp_qlora.qmd} | 8 +++- docs/{input_output.md => input_output.qmd} | 5 +- docs/{mac.md => mac.qmd} | 6 ++- docs/{multi-node.md => multi-node.qmd} | 5 +- docs/{multipack.md => multipack.qmd} | 5 +- docs/{nccl.md => nccl.qmd} | 5 +- docs/{rlhf.md => rlhf.qmd} | 5 +- favicon.jpg | Bin 0 -> 4638 bytes index.qmd | 19 ++++++++ styles.css | 1 + 20 files changed, 187 insertions(+), 34 deletions(-) create mode 100644 .github/workflows/docs.yml create mode 100644 _quarto.yml create mode 100644 docs/.gitignore create mode 100644 docs/config.qmd rename docs/{debugging.md => debugging.qmd} (99%) delete mode 100644 docs/faq.md create mode 100644 docs/faq.qmd rename docs/{fsdp_qlora.md => fsdp_qlora.qmd} (92%) rename docs/{input_output.md => input_output.qmd} (98%) rename docs/{mac.md => mac.qmd} (89%) rename docs/{multi-node.md => multi-node.qmd} (95%) rename docs/{multipack.md => multipack.qmd} (92%) rename docs/{nccl.md => nccl.qmd} (98%) rename docs/{rlhf.md => rlhf.qmd} (90%) create mode 100644 favicon.jpg create mode 100644 index.qmd create mode 100644 styles.css diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml new file mode 100644 index 0000000000..a2b797fa2f --- /dev/null +++ b/.github/workflows/docs.yml @@ -0,0 +1,28 @@ +name: Publish Docs +on: + push: + branches: + - main + +permissions: + contents: write + pages: write + +jobs: + build-deploy: + runs-on: ubuntu-latest + steps: + - name: Check out 
repository + uses: actions/checkout@v4 + - name: Set up Quarto + uses: quarto-dev/quarto-actions/setup@v2 + - name: Setup Python + uses: actions/setup-python@v3 + with: + python-version: '3.10' + - name: Publish to GitHub Pages (and render) + uses: quarto-dev/quarto-actions/publish@v2 + with: + target: gh-pages + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} diff --git a/.gitignore b/.gitignore index 9d6a103dab..589440abf6 100644 --- a/.gitignore +++ b/.gitignore @@ -2,6 +2,7 @@ configs last_run_prepared/ .vscode +_site/ # Byte-compiled / optimized / DLL files __pycache__/ @@ -172,3 +173,5 @@ wandb lora-out/* qlora-out/* mlruns/* + +/.quarto/ diff --git a/README.md b/README.md index eb84a02461..9b5d4cc3fd 100644 --- a/README.md +++ b/README.md @@ -149,7 +149,7 @@ accelerate launch -m axolotl.cli.train https://raw.githubusercontent.com/OpenAcc ``` >[!Tip] -> If you want to debug axolotl or prefer to use Docker as your development environment, see the [debugging guide's section on Docker](docs/debugging.md#debugging-with-docker). +> If you want to debug axolotl or prefer to use Docker as your development environment, see the [debugging guide's section on Docker](docs/debugging.qmd#debugging-with-docker).
@@ -267,7 +267,7 @@ Use the below instead of the install method in QuickStart. ``` pip3 install -e '.' ``` -More info: [mac.md](/docs/mac.md) +More info: [mac.md](/docs/mac.qmd) #### Launching on public clouds via SkyPilot To launch on GPU instances (both on-demand and spot instances) on 7+ clouds (GCP, AWS, Azure, OCI, and more), you can use [SkyPilot](https://skypilot.readthedocs.io/en/latest/index.html): @@ -409,7 +409,7 @@ pretraining_dataset: # hf path only {"segments": [{"label": true|false, "text": "..."}]} ``` -This is a special format that allows you to construct prompts without using templates. This is for advanced users who want more freedom with prompt construction. See [these docs](docs/input_output.md) for more details. +This is a special format that allows you to construct prompts without using templates. This is for advanced users who want more freedom with prompt construction. See [these docs](docs/input_output.qmd) for more details. ##### Conversation @@ -1125,7 +1125,7 @@ fsdp_config: ##### FSDP + QLoRA -Axolotl supports training with FSDP and QLoRA, see [these docs](docs/fsdp_qlora.md) for more information. +Axolotl supports training with FSDP and QLoRA, see [these docs](docs/fsdp_qlora.qmd) for more information. ##### Weights & Biases Logging @@ -1204,7 +1204,7 @@ although this will be very slow, and using the config options above are recommen ## Common Errors 🧰 -See also the [FAQ's](./docs/faq.md) and [debugging guide](docs/debugging.md). +See also the [FAQ's](./docs/faq.qmd) and [debugging guide](docs/debugging.qmd). > If you encounter a 'Cuda out of memory' error, it means your GPU ran out of memory during the training process. Here's how to resolve it: @@ -1238,7 +1238,7 @@ It's safe to ignore it. > NCCL Timeouts during training -See the [NCCL](docs/nccl.md) guide. +See the [NCCL](docs/nccl.qmd) guide. 
### Tokenization Mismatch b/w Inference & Training @@ -1256,7 +1256,7 @@ Having misalignment between your prompts during training and inference can cause ## Debugging Axolotl -See [this debugging guide](docs/debugging.md) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode. +See [this debugging guide](docs/debugging.qmd) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode. ## Need help? 🙋 diff --git a/_quarto.yml b/_quarto.yml new file mode 100644 index 0000000000..31aa90398e --- /dev/null +++ b/_quarto.yml @@ -0,0 +1,51 @@ +project: + type: website + +website: + title: "Axolotl" + description: "Fine-tuning" + favicon: favicon.jpg + navbar: + title: Axolotl + background: dark + pinned: false + collapse: false + tools: + - icon: twitter + href: https://twitter.com/axolotl_ai + - icon: github + href: https://github.com/OpenAccess-AI-Collective/axolotl/ + - icon: discord + href: https://discord.gg/7m9sfhzaf3 + + sidebar: + pinned: true + collapse-level: 2 + style: docked + contents: + - text: Home + href: index.qmd + - section: "How-To Guides" + contents: + # TODO Edit folder structure after we have more docs. + - docs/debugging.qmd + - docs/multipack.qmd + - docs/fsdp_qlora.qmd + - docs/input_output.qmd + - docs/rlhf.qmd + - docs/nccl.qmd + - docs/mac.qmd + - docs/multi-node.qmd + - section: "Reference" + contents: + - docs/config.qmd + - docs/faq.qmd + + + + +format: + html: + theme: materia + css: styles.css + toc: true diff --git a/devtools/README.md b/devtools/README.md index 1d727ed8bb..0114ee3a80 100644 --- a/devtools/README.md +++ b/devtools/README.md @@ -1 +1 @@ -This directory contains example config files that might be useful for debugging. Please see [docs/debugging.md](../docs/debugging.md) for more information. +This directory contains example config files that might be useful for debugging. Please see [docs/debugging.qmd](../docs/debugging.qmd) for more information.
diff --git a/docs/.gitignore b/docs/.gitignore new file mode 100644 index 0000000000..4c23a061fa --- /dev/null +++ b/docs/.gitignore @@ -0,0 +1,2 @@ +/.quarto/ +_site/ diff --git a/docs/config.qmd b/docs/config.qmd new file mode 100644 index 0000000000..d93b170e7b --- /dev/null +++ b/docs/config.qmd @@ -0,0 +1,17 @@ +--- +title: Config options +description: A complete list of all configuration options. +--- + +```{python} +#|echo: false +#|output: asis +import re +# Regex pattern to match the YAML block including its code fence +pattern = r'<details[^>]*id="all-yaml-options"[^>]*>.*?<summary>All yaml options.*?```yaml(.*?)```.*?</details>' + +with open('../README.md', 'r') as f: + doc = f.read() +match = re.search(pattern, doc, re.DOTALL) +print("```yaml", match.group(1).strip(), "```", sep="\n") +``` diff --git a/docs/debugging.md b/docs/debugging.qmd similarity index 99% rename from docs/debugging.md rename to docs/debugging.qmd index 59df0b785b..7237fbd6f2 100644 --- a/docs/debugging.md +++ b/docs/debugging.qmd @@ -1,4 +1,8 @@ -# Debugging Axolotl +--- +title: Debugging +description: How to debug Axolotl +--- + This document provides some tips and tricks for debugging Axolotl. It also provides an example configuration for debugging with VSCode. A good debugging setup is essential to understanding how Axolotl code works behind the scenes. diff --git a/docs/faq.md b/docs/faq.md deleted file mode 100644 index 6542306538..0000000000 --- a/docs/faq.md +++ /dev/null @@ -1,18 +0,0 @@ -# Axolotl FAQ's - - -> The trainer stopped and hasn't progressed in several minutes. - -Usually an issue with the GPU's communicating with each other. See the [NCCL doc](../docs/nccl.md) - -> Exitcode -9 - -This usually happens when you run out of system RAM. - -> Exitcode -7 while using deepspeed - -Try upgrading deepspeed w: `pip install -U deepspeed` - -> AttributeError: 'DummyOptim' object has no attribute 'step' - -You may be using deepspeed with single gpu. Please don't set `deepspeed:` in yaml or cli. diff --git a/docs/faq.qmd b/docs/faq.qmd new file mode 100644 index 0000000000..91413d24e9 --- /dev/null +++ b/docs/faq.qmd @@ -0,0 +1,21 @@ +--- +title: FAQ +description: Frequently asked questions +--- + + +**Q: The trainer stopped and hasn't progressed in several minutes.** + +> A: Usually an issue with the GPUs communicating with each other. See the [NCCL doc](nccl.qmd) + +**Q: Exitcode -9** + +> A: This usually happens when you run out of system RAM.
+ +**Q: Exitcode -7 while using deepspeed** + +> A: Try upgrading deepspeed with: `pip install -U deepspeed` + +**Q: AttributeError: 'DummyOptim' object has no attribute 'step'** + +> A: You may be using deepspeed with a single GPU. Please don't set `deepspeed:` in yaml or cli. diff --git a/docs/fsdp_qlora.md b/docs/fsdp_qlora.qmd similarity index 92% rename from docs/fsdp_qlora.md rename to docs/fsdp_qlora.qmd index 14b2c1a571..69b4ad4454 100644 --- a/docs/fsdp_qlora.md +++ b/docs/fsdp_qlora.qmd @@ -1,4 +1,10 @@ -# FDSP + QLoRA +--- +title: FSDP + QLoRA +description: Use FSDP with QLoRA to fine-tune large LLMs on consumer GPUs. +format: + html: + toc: true +--- ## Background diff --git a/docs/input_output.md b/docs/input_output.qmd similarity index 98% rename from docs/input_output.md rename to docs/input_output.qmd index dbc6979c6f..4e2ea1345f 100644 --- a/docs/input_output.md +++ b/docs/input_output.qmd @@ -1,4 +1,7 @@ -# Template-free prompt construction with the `input_output` format +--- +title: Template-free prompt construction +description: "Template-free prompt construction with the `input_output` format" +--- diff --git a/docs/mac.md b/docs/mac.qmd similarity index 89% rename from docs/mac.md rename to docs/mac.qmd index 59eacce6d0..2a83035381 100644 --- a/docs/mac.md +++ b/docs/mac.qmd @@ -1,8 +1,12 @@ -# Mac M series support +--- +title: Mac M-series +description: Mac M-series support +--- Currently Axolotl on Mac is partially usable, many of the dependencies of Axolotl including Pytorch do not support MPS or have incomplete support.
Current support: + - [x] Support for all models - [x] Full training of models - [x] LoRA training diff --git a/docs/multi-node.md b/docs/multi-node.qmd similarity index 95% rename from docs/multi-node.md rename to docs/multi-node.qmd index 6806159690..5c6fa976b9 100644 --- a/docs/multi-node.md +++ b/docs/multi-node.qmd @@ -1,4 +1,7 @@ -# Multi Node +--- +title: Multi Node +description: How to use Axolotl on multiple machines +--- You will need to create a configuration for accelerate, either by using `accelerate config` and follow the instructions or you can use one of the preset below: diff --git a/docs/multipack.md b/docs/multipack.qmd similarity index 92% rename from docs/multipack.md rename to docs/multipack.qmd index bee13b62c3..097bcd2e50 100644 --- a/docs/multipack.md +++ b/docs/multipack.qmd @@ -1,4 +1,7 @@ -# Multipack (Sample Packing) +--- +title: Multipack (Sample Packing) +description: Multipack is a technique to pack multiple sequences into a single batch to increase training throughput. +--- ## Visualization of Multipack with Flash Attention diff --git a/docs/nccl.md b/docs/nccl.qmd similarity index 98% rename from docs/nccl.md rename to docs/nccl.qmd index 4a7ff5d5d6..3b616aa665 100644 --- a/docs/nccl.md +++ b/docs/nccl.qmd @@ -1,4 +1,7 @@ -# NCCL +--- +title: NCCL +description: Troubleshooting NCCL issues +--- NVIDIA NCCL is a library to facilitate and optimize multi-GPU communication operations, such as broadcast, all-gather, reduce, all-reduce, etc. Broadly, NCCL configuration is highly environment-specific and is configured via several [environment variables](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html). 
A common NCCL-related problem occurs when a long-running operation times out causing the training process to abort: diff --git a/docs/rlhf.md b/docs/rlhf.qmd similarity index 90% rename from docs/rlhf.md rename to docs/rlhf.qmd index 4f71184fc0..7db68915ad 100644 --- a/docs/rlhf.md +++ b/docs/rlhf.qmd @@ -1,4 +1,7 @@ -# RLHF (Beta) +--- +title: "RLHF (Beta)" +description: "Reinforcement Learning from Human Feedback is a method whereby a language model is optimized from data using human feedback." +--- ### Overview diff --git a/favicon.jpg b/favicon.jpg new file mode 100644 index 0000000000000000000000000000000000000000..43c69024430555f849ee0077664f194dae8d912f GIT binary patch literal 4638 zcmbtU2UJsAmp(~oB8s#J2m%U-3W6d_R2}HU`SIQHCC{>CCktQHf zX?7(vAySfnQUnuJ3^j8>N9SKNYt6rAzI$`dclX(Q-~H`#ZW!+wB=ABf+|vyJV`Fdx z0Kf)VVB7!!F&K0K*j~WA%>g(E6Z*{^V26JBzyW~s{P9on1WdpDp`6>$|MmQHX3YeE z6|yiLlb1i91^dZia0IyhpZ4R&j@^d+$#@(7JEQ(>#2;SD^2=xYC0hf748Lult&KFZ zFx53Us{>$A{qW4o&OSj*#{ls04Gy%>(-N~iZzuLU1R{>kLH?)B%+CH-+rRSVp{-5; z(0_C**T1g)uTMK&T!WmUKC(fcC!PHRgCYJm#7)A2{kJ&O4#4Z|;phVKQi#h0LIpzn z-Im?yC!gBl?mszwiw9a+XhYgKATH+qlgn&zr=NUFi0kDz8V5}80 z7#k%3Al?J;IQvh$UpCavMaZ7{$0kt-0B0NkO&x!1PGtZzLp>H4_je3*+0GJ+VApv7c%WXh+y8&pfA$;7y_Fx73FQjCS6UwcKt?nPHx^mH}i{2 zO3TVC?p9XSKWJ!dYHoSh`n0pFyQlZr^B08vfx)5S_ambuGG%J|kFlfUNTtE|S-c^gLo>A7= z*A!>dsH|&rb%Ek4I+uz+?4+mD%hGX|j;isr^vKjH-VKZs+K~ZLXtFRzySI^O(Jj4c#MBw?#e2-JOBBV z&55cx!z8f{!Y=c{@63m8_)aN}tGX;x$^tg`57Z8^x!g4DvZ_-0fJkpqr`g%`j6ENn z80^r^J~q_&_DstF;;!SnD_(g-oq*6t+^q`nW;N%f@{LE;iMF|Zd*xg(jCX7Nq>8ST zx7_HHvCHb(QDX$PH&%a1mKVqhwjOrSEQ>CUej&n6#mC6X)w*{0I6K_>*c!55JXPds zd>nQ;11HSBd7*t5&EeDIz59x8mOL8{d{*MM8)SEw+wW_^&BiTT5?rax@dx;A4t|^q zck;YI`63_t7zOuUCwEU-Jawj=l2%$+TGG57v_q$_U4t@@WRLF4^e#H~h{~T7{2bG8 z*>$IXGqO#e;`+=@TuNo~(42~}4RM^mo=SO|_ajrL*>j+Z0y{UA`wQft38bWSs zSxAWMjOo5h{;w@o3t0_irHz`txJCsN)}%dBHdWrKDXXUqF5lSGFd)LiZOSEfSM&3Y z3@R?SgUdTwesJBr+M{)WEVN59hRfxq>|SdogY;NM6^F=H%I#XI<)V$c4uN&1Yp>g* zDkWp*ly|!O6zI|O!=&mE`AbKGD7APJZs!NgQ#SKb8iIJW?1CR`b6z(}|0HWu 
zcn$qWC%QFU(5cSMtiUqN@HtMW;%(4u3xDGA#=8EOl|5>MHO13@Vb=FnQI(td*y7ef zs=c+!<8z`30V0w#jxQBY`~svE%og9@oOCC&FaYOyrI<-+V~yGKvd#%-M>3zCP0C$K zKecvF@2c+t=^eh|J1V=rQSgA?RFi7F!@Z1Wk;6+S7mfImtL{1;AYLDdZu49bEgX7j z6(Gl{H~yf$KRVy;lQ(J156z+(EnT`v_<_q)Q>3KWZ`ekqtBA$$_U^e+%`x?`SJ3a> zCH8N4x;|#KM6p6x(a#apGPBUQN_-anYG;{wKdOY$M^9PCbWm#_MMRJdreAtm5ZQY7 z>A#t*5PZH8RlSOAVu1KwTY7+ZFEqDfwUh(_7m_tmpdz<;nJLWXlMFExm4KO{xWn4Ev^&E z-0nE}q-tGnfTz=pn9JoA@FG*qW&M?gzs04amFwu2 z%L(ulh708w25|cPz2fj9w^CxU=8=ubt>E{Y*PdLe zQ!^<#*zRg9g=;IDSep0ttd9&?WB@z>{-kaKV45rSC08V%_jH~^{ zBR@I`^Vh?xx)pVEn3Ak+7Dps)oXC_uvY`3+@3&^&pYI&`=NeC0@|vATD>}pOf+V{z zz2GRpVJ@`oGtNiDUVK%%?=$h-piF~t@MX!^aj{FG+02DuhrkqPPBu8r#J*zGr6Gf- zMYmPX=gldVE_HIt!nJwN6zg80!mpf!KjI}z8*=$Ip!NXk`< zU~!pbfO;Vgb=BdQxvE5@SL?tzFG5U&EPE;?S|D|Ip*L;rrrs((&uPw_POM+NQGc;% z*pJTobx2gUXt!-uL6fIMo7eZ|Jd@Q3!B+Rw*BuWYVrA-s)~gA4(xr+Y^CVGT@79c@ zcH;oP&lZ30bTz6pi%u$YXMm@@n3dVp;jh-V)`7mLBE_?f&7l|uf!@FDOFvICvtZX*`DSGkRBr zgROmd=;kVcJn=HDqakTeEoNxHz$NeDjD?H0kwPdRt=U(h@6#0uZ`QS=TNd)C$rEhT z>D_CF{wW3_&C@=drU)|AVbXwoz|st=w?}hCg8?Y%g`Qy*I{O+MYjnhAXa3@wZ`@NS zQ8Y!7y6*k;fguCLp0=mIwHo@4Ov$HsNZ5bwE2%qtsxzqI)~&)C=WyDR4ES`tJ7%PL z5ziSty2;-c+?0@~lOuV<1x_Bj1`qF4Nau=lmg1vexf#GW{L=>U^V^g0y2YU64WKAu5%@QU+MJ+=I4xDavGKzGiJv3(4@+rz_1)O zX*q!&b=&sC`|ybcZ;Y>I-zZl?R+3JZJ+r*Ju1d`pqWJPM10)QVT%ey*H<6xd6g`~x zIwY^@S;tb7>N?S8->zs+I+3j4pfpf}tGOr9(5An?`C-d-%7SUYRBd-wZw`;bKWlHg z#~z#6PxGyQ+HvWDD&2Od$6^YbN7@ko(SD;VlD^l7XdLutB^ZfYi4Yyee8#RB(M7u$ zU{|}gkK^%hwRiWE3wmWd?kZ`FbqC^J6!`10vhsaj&Y_}^xvR!DU7OOmt8Z4d!(@(~ z@p+XY#`?+CIqTa#RVO^aPv(entPA1nkfNe=qB-GS@k8a@+yJ`MoK^n0G^ebJn+y;- zdta-@clwxyHEaGx{~+3YG9z)-#XQYz(MMK^t;gQdD{KFpbLz{(AlORa@#G~NEY{_G zn%y;9_u995tGH+MXjO`hYI-eFdG8JSqk*C4(gydG)->;Qp?z;W)esus(YtS(K9p`Q z7@?HODPE*VdJ*`Z}Wji7gODT$m?Ep@LVi^{xxN{|D8zT6*N|GxN}vcghVVa<{;j{-Vx)L1d(tu~1FD90PL;wTSO;>zX{3r^QG8IT;-G~q2 z+&Gkruc#ZuX}S;^+aiigmt6&A-7@N;Pt75PJNBUEsS{m(g7)?!o<*|Bqus^r4G(3_ zwXv$>hxbkTRQPirnL^Q!xmX^I7w6Tz-N}8H7Tzdj|9%~I-M@?<$+OewPIw3WWC?c3 
z6U)i~3l*DLM=_@~WEjA9!fvG?W2YX!DbYIi(S*sXjjF^0!F98zzbPuJhnWnHlwbEw z3j0(AG!5rB`RO*DtLbfUZ@aYD@!YBLaevgyh<#$JTdj zKE$vuc9XqX^MLqmk#ckON#tMf1>3t7GW$LeV z!&ZXQM{pJBi)-gOuNf_;>|!-0ur=w6j6Y>=NEp$3MLZxOTok`fCnYP=uurioJUGn! z7(U0H&RG(*3!P)g6_}0Hp7$C})3+?G$#-LbZw kXV&c*K;+)H&7A3wFC?0!zg{OL@T_t}^q22Tnv9YE0Lfy+e*gdg literal 0 HcmV?d00001 diff --git a/index.qmd b/index.qmd new file mode 100644 index 0000000000..87d6858808 --- /dev/null +++ b/index.qmd @@ -0,0 +1,19 @@ + + +```{python} +#|output: asis +#|echo: false + +# This cell steals the README as the home page for now, but excludes the table of contents (quarto adds its own) +import re +pattern = re.compile( + r"\s*\s*\s*\s*
\s*## Table of Contents.*?
", + re.DOTALL | re.IGNORECASE +) + +with open('README.md', 'r') as f: + txt = f.read() + +cleaned = pattern.sub("", txt) +print(cleaned) +``` diff --git a/styles.css b/styles.css new file mode 100644 index 0000000000..2ddf50c7b4 --- /dev/null +++ b/styles.css @@ -0,0 +1 @@ +/* css styles */
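Note on the `docs/config.qmd` cell in this patch: it calls `match.group(1)` unconditionally, so the render would raise `AttributeError` if the README block were ever renamed or removed. Below is a minimal sketch of a guarded variant, not the patch's own code. It assumes the options sit inside a `<details id="all-yaml-options">` element that wraps one `yaml` fence; the function name `extract_yaml_options` and the sample README text are hypothetical.

```python
import re

# Built at runtime so this example contains no literal triple backticks.
FENCE = "`" * 3

# Assumption: the README wraps its options in <details id="all-yaml-options">
# around a single yaml fence. Adjust the pattern if the README differs.
PATTERN = re.compile(
    r'<details[^>]*id="all-yaml-options"[^>]*>.*?'
    + FENCE + r"yaml(.*?)" + FENCE
    + r".*?</details>",
    re.DOTALL,
)


def extract_yaml_options(doc):
    """Return the YAML text inside the options block, or None if it is absent."""
    match = PATTERN.search(doc)
    return match.group(1).strip() if match else None


# Hypothetical README fragment used only to exercise the function.
sample = (
    '<details id="all-yaml-options">\n'
    "<summary>All yaml options</summary>\n\n"
    + FENCE + "yaml\nbase_model: example/model\n" + FENCE + "\n\n"
    "</details>\n"
)

options = extract_yaml_options(sample)
missing = extract_yaml_options("README without the options block")
```

Returning `None` lets the calling cell decide whether to print a placeholder or fail the build with a clearer message than the bare `AttributeError`.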