In this video, we look at the paper "QLoRA: Efficient Finetuning of Quantized LLMs," which introduces a quantization technique, QLoRA, that enables fine-tuning of large language models (e.g., 13B and 33B parameters) on consumer GPUs by drastically reducing the memory required. We will try out the model demo on HuggingFace and then walk through code examples showing how to fine-tune with QLoRA.
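For reference, a minimal sketch of what a QLoRA-style setup looks like with the HuggingFace `transformers`, `peft`, and `bitsandbytes` libraries; the model id and hyperparameters below are illustrative assumptions, not values from the video:

```python
# Sketch: QLoRA-style fine-tuning setup — a 4-bit quantized base model
# with trainable LoRA adapters on top. Assumes `transformers`, `peft`,
# and `bitsandbytes` are installed; the model id and hyperparameters
# are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type from the QLoRA paper
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # run the forward pass in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                  # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64,                                   # adapter rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # which layers get adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the LoRA adapters are trainable
```

The base model's 4-bit weights stay frozen; gradients flow only through the small LoRA adapter matrices, which is what keeps the memory footprint within consumer-GPU range.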
▬▬▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬
☕ Buy me a Coffee:
🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord:
▶️️ Subscribe:
📧 Business Contact: engineerprompt@gmail.com
💼Consulting:
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
LINKS:
QLoRA Paper:
HuggingFace Blogpost:
QLoRA usage Notebook:
QLoRA fine-tuning Notebook:
Guanaco 33B Demo:
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
All Interesting Videos:
Everything LangChain:
Everything LLM:
Everything Midjourney:
AI Image Generation: