The NLP Lab
QLoRA: Efficient Finetuning of Quantized LLMs | Paper summary
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/.
===
Link: arxiv.org/abs/2305.14314
Blog post on how to use QLoRA: huggingface.co/blog/4bit-transformers-bitsandbytes
Abstract: We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance. QLoRA backpropagates gradients through a frozen, 4-bit quantized pretrained language model into Low Rank Adapters (LoRA). Our best model family, which we name Guanaco, outperforms all previous openly released models on the Vicuna benchmark, reaching 99.3% of the performance level of ChatGPT while only requiring 24 hours of finetuning on a single GPU. QLoRA introduces a number of innovations to save memory without sacrificing performance: (a) 4-bit NormalFloat (NF4), a new data type that is information theoretically optimal for normally distributed weights, (b) double quantization to reduce the average memory footprint by quantizing the quantization constants, and (c) paged optimizers to manage memory spikes. We use QLoRA to finetune more than 1,000 models, providing a detailed analysis of instruction following and chatbot performance across 8 instruction datasets, multiple model types (LLaMA, T5), and model scales that would be infeasible to run with regular finetuning (e.g. 33B and 65B parameter models). Our results show that QLoRA finetuning on a small high-quality dataset leads to state-of-the-art results, even when using smaller models than the previous SoTA. We provide a detailed analysis of chatbot performance based on both human and GPT-4 evaluations showing that GPT-4 evaluations are a cheap and reasonable alternative to human evaluation. Furthermore, we find that current chatbot benchmarks are not trustworthy to accurately evaluate the performance levels of chatbots. A lemon-picked analysis demonstrates where Guanaco fails compared to ChatGPT. We release all of our models and code, including CUDA kernels for 4-bit training.
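As a concrete illustration, here is a minimal sketch of how the paper's three ingredients map onto a typical Hugging Face setup with the transformers, peft, and bitsandbytes libraries (covered in the blog post linked above). The model name and LoRA hyperparameters below are illustrative, not the paper's exact configuration.

```python
# Minimal QLoRA-style setup: a frozen 4-bit NF4 base model with trainable
# LoRA adapters. Model name and hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the frozen base model to 4 bits
    bnb_4bit_quant_type="nf4",              # (a) 4-bit NormalFloat (NF4)
    bnb_4bit_use_double_quant=True,         # (b) double quantization of the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in 16-bit while weights stay 4-bit
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # which projections get adapters (illustrative)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # gradients flow only into the LoRA adapters
model.print_trainable_parameters()

# (c) paged optimizers are exposed through the Trainer, e.g.:
training_args = TrainingArguments(output_dir="out", optim="paged_adamw_32bit")
```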
===
Follow us on social media to get regular updates on NLP developments:
LinkedIn: www.linkedin.com/company/86900096/admin/
Weekly NLP newsletter: theglobalnlplab.substack.com/
Twitter: TheGlobalNLPLab
#artificialintelligence #nlp #chatgpt #ml #ai #nlproc #machinelearning
Views: 931

Videos

Language Models Don’t Always Say What They Think | Paper summary
379 views · 11 months ago
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Link: arxiv.org/abs/2305.04388 Abstract: Large Language Models (LLMs) can achieve strong performance on many tasks by producing step-by-step reasoning before giving a final output, often referred to as chain-of-thought reasoning (CoT). ...
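For context, chain-of-thought prompting simply means showing the model a worked, step-by-step solution before asking a new question. A minimal sketch follows; the prompt text is a classic illustrative example, not taken from this paper.

```python
# A few-shot chain-of-thought (CoT) prompt: the worked example demonstrates
# step-by-step reasoning before the final answer, and the model is expected
# to continue in the same style for the new question.
cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls.
5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. They used 20 to make lunch and bought 6 more.
How many apples do they have?
A:"""

print(cot_prompt)  # send this to any LLM; it should reason step by step before answering
```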
Automatic Prompt Optimization with "Gradient Descent" and Beam Search | Paper summary
1.2K views · 11 months ago
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Link: arxiv.org/pdf/2305.03495.pdf Abstract: Large Language Models (LLMs) have shown impressive performance as general purpose agents, but their abilities remain highly dependent on prompts which are hand written with onerous trial-and-...
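A rough sketch of the paper's core loop is below, assuming a generic llm(prompt) -> str completion function, which is a hypothetical placeholder rather than a real API; the actual method additionally uses beam search over candidate prompts and bandit-style candidate selection.

```python
# "Textual gradient descent" over a prompt: collect errors, ask an LLM to
# critique the prompt (the "gradient"), then ask it to edit the prompt in the
# direction of that critique (the "descent step"). `llm` is a placeholder.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API of choice here")

def optimize_prompt(task_prompt: str, examples: list[tuple[str, str]], steps: int = 3) -> str:
    for _ in range(steps):
        # 1. Find minibatch examples the current prompt gets wrong.
        errors = [(x, y) for x, y in examples if llm(f"{task_prompt}\n{x}") != y]
        if not errors:
            break
        # 2. "Gradient": a natural-language critique of the prompt's failures.
        critique = llm(f"The prompt '{task_prompt}' failed on {errors[:3]}. Why?")
        # 3. "Descent step": edit the prompt to address the critique.
        task_prompt = llm(f"Rewrite the prompt '{task_prompt}' to fix this: {critique}")
    return task_prompt
```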
Falcon LLM: the Best Open-source LLM Available at the Moment
2.1K views · 11 months ago
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Link: falconllm.tii.ae/ Models link: huggingface.co/tiiuae Abstract: Falcon LLM is a foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens. TII has now released Falcon LLM - a 40B model. The m...
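The released checkpoints can be loaded directly from the Hugging Face Hub. A minimal sketch: the 7B instruct variant is shown because it fits on a single GPU in bf16, and the generation settings are illustrative.

```python
# Load a Falcon checkpoint from the Hugging Face Hub and generate text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "tiiuae/falcon-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,   # Falcon shipped custom modeling code at release time
    device_map="auto",
)
generate = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generate("Explain LoRA in one sentence:", max_new_tokens=50)[0]["generated_text"])
```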
Tree of Thoughts: Deliberate Problem Solving with Large Language Models | Paper summary
923 views · 1 year ago
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Link: arxiv.org/abs/2305.10601 Abstract: Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inf...
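A condensed sketch of the Tree-of-Thoughts search loop: instead of decoding left-to-right, the model proposes several candidate "thoughts" per step and a value function prunes the tree. The propose and evaluate functions below are hypothetical placeholders for LLM calls, not part of any library, and the search parameters are illustrative.

```python
# Breadth-first search over partial solutions ("thoughts") with pruning.
def propose(state: str, k: int) -> list[str]:
    raise NotImplementedError  # ask the LLM for k candidate next-step thoughts

def evaluate(state: str) -> float:
    raise NotImplementedError  # ask the LLM to score how promising a state is

def tree_of_thoughts(problem: str, steps: int = 3, breadth: int = 5, keep: int = 2) -> str:
    frontier = [problem]
    for _ in range(steps):
        # Expand every kept state with several candidate thoughts.
        candidates = [s + "\n" + t for s in frontier for t in propose(s, breadth)]
        # Keep only the most promising partial solutions.
        frontier = sorted(candidates, key=evaluate, reverse=True)[:keep]
    return frontier[0]
```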
LIMA: Less Is More for Alignment | Paper summary
969 views · 1 year ago
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Link: arxiv.org/abs/2305.11206 Abstract: Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement lea...
Learning to Reason and Memorize with Self-Notes | Paper summary
278 views · 1 year ago
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Paper title: Learning to Reason and Memorize with Self-Notes Link: arxiv.org/abs/2305.00833 Paper abstract: Large language models have been shown to struggle with limited context memory and multi-step reasoning. We propose a simple meth...
What Will the NLP Industry Look Like in 6-12 Months? with @Slator
275 views · 1 year ago
This is a clip from the Slator podcast on which I was a guest. You can watch the full podcast here: ua-cam.com/video/YpjI5F24bbU/v-deo.html Summary: Which types of companies will benefit the most from NLP technology? What will the field look like in 6-12 months? If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Glob...
Who will Win the Large Language Model App Race? with @Slator
231 views · 1 year ago
This is a clip from the Slator podcast on which I was a guest. You can watch the full podcast here: ua-cam.com/video/YpjI5F24bbU/v-deo.html Summary: We discuss who will win from the current explosion of LLM use cases and applications. Will it be mainly the big companies, or are we going to see an explosion of startups? If you are looking to add advanced expertise in Natural Languag...
Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models | Paper and Demo
623 views · 1 year ago
This video was created with Synthesia: www.synthesia.io/?via=nlplab If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Paper link: arxiv.org/abs/2303.04671 Code: github.com/microsoft/visual-chatgpt Abstract: ChatGPT is attracting a cross-field interest as it provides a language interfac...
The rise of API-powered NLP apps: hype cycle, or a new disruptive industry?
234 views · 1 year ago
This video was created with Synthesia: www.synthesia.io/?via=nlplab If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. What is the disruptive potential of API-powered NLP apps? Are they poised to deliver transformative results to all industries? Or, will their impact be limited to certa...
LLaMA | New open foundation Large Language Model by Meta AI | Paper summary
3.9K views · 1 year ago
This video was created with Synthesia: www.synthesia.io/?via=nlplab If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Title: LLaMA: Open and Efficient Foundation Language Models Link: research. publications/llama-open-and-efficient-foundation-language-models/ Follow us on s...
ChatGPT: One model for any NLP task? | Paper explained
988 views · 1 year ago
This video was created with Synthesia: www.synthesia.io/?via=nlplab If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. Paper title: Is ChatGPT a General-Purpose Natural Language Processing Task Solver? Link: arxiv.org/abs/2302.06476 Abstract: Spurred by advancements in scale, large lang...
The Flan Collection: Open Source Instruction Tuning | Paper explained
1.9K views · 1 year ago
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. This video was created with Synthesia: www.synthesia.io/?via=nlplab Papers to read: - Scaling Instruction-Finetuned Language Models (arxiv.org/abs/2210.11416) - The Flan Collection: Designing Data and Methods for Effective Instruction T...
Adaptive Machine Translation with Large Language Models | Paper explained
527 views · 1 year ago
If you are looking to add advanced expertise in Natural Language Processing to your team, you should check out our services at The Global NLP Lab: nlplab.tech/. This video was created with Synthesia: www.synthesia.io/?via=nlplab Paper Title: Adaptive Machine Translation with Large Language Models Link: arxiv.org/abs/2301.13294 Abstract: Consistency is a key requirement of high-quality translati...
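A sketch of the paper's core idea: prepend similar past translations ("fuzzy matches" retrieved from a translation memory) as in-context examples, so the LLM adapts to the required terminology and style. The sentence pairs below are illustrative.

```python
# Build an adaptive MT prompt from fuzzy matches plus the new source sentence.
fuzzy_matches = [
    ("The switch turns the device off.", "L'interrupteur éteint l'appareil."),
    ("Press the switch to restart.", "Appuyez sur l'interrupteur pour redémarrer."),
]
source = "The switch turns the device on."

prompt = "".join(f"English: {en}\nFrench: {fr}\n\n" for en, fr in fuzzy_matches)
prompt += f"English: {source}\nFrench:"
print(prompt)  # send to an LLM; it should reuse the terminology of the examples
```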
Easy Way to Link a Large Language Model to an External Database | REPLUG Paper explained
879 views · 1 year ago
How Close is ChatGPT to Human Experts?
302 views · 1 year ago
Is GPT 3 a Good Data Annotator?
319 views · 1 year ago
When to use a large language model? 4 points to consider in 2023.
715 views · 1 year ago
State-of-the-art Zero-shot Speech Synthesis with Vall-E
1.9K views · 1 year ago
Advanced Reasoning with Large Language Models with Chain of Thought Prompting | Paper explained!
10K views · 1 year ago
In-context Learning - A New Paradigm in NLP?
6K views · 1 year ago
Four Natural Language Processing Research Trends to Watch in 2023
3.7K views · 1 year ago
Automatic Prompt Tuning for Large Language Models | RLPROMPT paper explained!
4.9K views · 1 year ago
New Embedding Model by OpenAI - Intro and Explanation
5K views · 1 year ago
A Neural Corpus Indexer for Document Retrieval | Paper explained
522 views · 1 year ago
ChatGPT Explained | Optimizing Language Models for Dialogue
2.5K views · 1 year ago
Efficient Training of Language Models to Fill in the Middle | Paper summary
837 views · 1 year ago
BLOOM Large-scale Open source Language Model
418 views · 1 year ago
*Paper summary* ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models
698 views · 1 year ago

COMMENTS

  • @jasonwong8934 · 13 days ago

    The problem with CoT is that it requires the human user to know how to solve that sort of problem, and if they knew that, it'd be faster for them to solve the problem themselves. Also, it just requires more work from the user.

  • @flightsim-nl · 1 month ago

    The claimed advantage that discrete prompt can be understood by humans is questionable if the prompt looks like "ouslyicals downright certainly consistently" (sic) as in the given example.

  • @MBison-oc5tc · 3 months ago

    Are you a robot? Is it AI generated?

  • @rl1111rl · 3 months ago

    Background music is annoying and distracting

  • @lemark221 · 8 months ago

    Nothing new. She didn't even bother to run a test herself, lol. Waste of time.

  • @VijayBhaskarSingh · 8 months ago

    Is that a robot speaking?

  • @GauravKumar-gh5fz · 8 months ago

    Hi, can I get your mail ID? I want to run the GitHub repository of it.

  • @adityaramesh551 · 9 months ago

    I would just read a blog instead of a shitty video like this.

  • @morris5648 · 9 months ago

    What no human?

  • @wangdongsheng2076 · 9 months ago

    In the last slides, the first two takeaways contradict each other. That is, you take an LLM and you get FIM for free, but you also mentioned that fine-tuning an existing LM on FIM is less effective than pretraining an LM. It is confusing.

    • @keyvannarimani · 6 months ago

      It means that in order to fine-tune an existing LM on FIM, you would need to add three new token embeddings to the model and fine-tune it to learn these embeddings. This is less effective than training the model using FIM tokens from scratch.
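To make the FIM transformation in this exchange concrete, here is a toy sketch; the three sentinel strings stand in for the three new token embeddings mentioned above, and the exact token names vary between implementations.

```python
# Fill-in-the-middle (FIM) data transform: split a document into
# (prefix, middle, suffix) and move the middle to the end, so the model
# learns to generate the middle conditioned on prefix and suffix.
import random

def fim_transform(doc: str) -> str:
    i, j = sorted(random.sample(range(len(doc)), 2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    return f"<PRE>{prefix}<SUF>{suffix}<MID>{middle}"

print(fim_transform("def add(a, b):\n    return a + b\n"))
```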

  • @WillGilpin · 9 months ago

    UA-cam will add subtitles if we need them. No need to add them to the videos - they are distracting

  • @best_songs_ever3889 · 10 months ago

    is the host also computer generated using AI...!!🤔

  • @alexisbracav · 11 months ago

    Hello, how can we use it today? Do we need to buy it or download it? To add our own voices?

  • @computerconcepts3352 · 1 year ago

    Noice 👍

  • @GaoyuanFanboy123 · 1 year ago

    Thanks for the summary. Interesting how much high-quality data matters. But they did classic fine-tuning without LoRA? Obviously they have enough compute power to train the whole model, but it would be interesting to know benchmarks for a LoRA-trained model. To my knowledge, LoRA can even improve fine-tuning performance (?)

    • @TheGlobalNLPLab · 1 year ago

      Hello, thanks! Yes, they did classic fine tuning. Good point about LoRA, it would be interesting to see results with that.

  • @user-wr4yl7tx3w · 1 year ago

    Your explanations of papers are always clear and easy to follow.

  • @morris5648 · 1 year ago

    If you want to be taken seriously, get rid of the fake person. It is the most commonly used one.

  • @hailking5588 · 1 year ago

    Great videos! I have a question regarding the data: did they actually add the chains of thought to all of the training data, or only some of it?

  • @JorgetePanete · 1 year ago

    Why would you force subtitles?

  • @user-wr4yl7tx3w · 1 year ago

    great insights. thanks

  • @OmkarRahane-sh5vu · 1 year ago

    Did anyone find the person narrating the content to be AI generated, or was that just my perception?

  • @cmehta994 · 1 year ago

    I want to prompt-tune my Azure OpenAI GPT-3.5 Turbo model with my data. Can you please guide me?

    • @cmehta994 · 1 year ago

      Need it ASAP. Can you tell me if it's available?

    • @hashhash461 · 1 year ago

      @cmehta994 How is it going now?

  • @user-wr4yl7tx3w · 1 year ago

    Why hasn't ChatGPT used this?

  • @AaronWacker · 1 year ago

    That is a good thing imho, meaning wow well done! Share details of how you did the lip sync, etc. Nice agent too. Love the style and the coverage of this most important prompt based sidecar student model to a teacher model and presentation of context learning. I've been really using it and RLHF lately and can't believe how performant it is finding solutions to a well described problem. Kudos and keep up great coverage and excellent analysis on making it easier for others. --Much appreciated!! 🍮 yum

  • @ChaoticNeutralMatt · 1 year ago

    That uncanny valley consistency. So weird.

  • @yb3134 · 1 year ago

    Great content, indeed will be interesting to see which new players will succeed

  • @user-wr4yl7tx3w · 1 year ago

    How can we learn about scaling challenges with NLP?

  • @user-wr4yl7tx3w · 1 year ago

    Truly amazing. But it seems to be the same as OpenAI ChatGPT 4.0. Released yesterday.

  • @rickrejeleene8298 · 1 year ago

    Great job Nikola!

  • @incameet · 1 year ago

    What kinds of tasks can encoder-decoder LLMs like Flan-T5 not do well that decoder-only LLMs like PaLM or GPT-3 can do well?

    • @TheGlobalNLPLab · 1 year ago

      Check out this paper, which explores this: arxiv.org/pdf/2204.05832.pdf

  • @incameet · 1 year ago

    This introduction is too basic, and its content can be easily found online. Looking forward to more in-depth coverage of NLP topics.

    • @TheGlobalNLPLab · 1 year ago

      Thanks for your feedback! Hopefully I'll have time to make some more in-depth videos soon!

  • @user-wr4yl7tx3w · 1 year ago

    The quality of the content is quite high. It should be highly recommended by UA-cam.

  • @enes-the-cat-father · 1 year ago

    Quality content! Please keep up this great work!

  • @temanapotaka-dewes9097 · 1 year ago

    CoT prompting seems to be a logical solution to getting the LLM to do what you want it to do. What are the limitations of CoT prompting?

    • @TheGlobalNLPLab · 1 year ago

      Thanks for the comment! The limitations might be that for some tasks CoT might potentially be overcomplicating / diluting the target problem. Also, CoT increases the input/output sequence lengths, leading to slower inference / greater cost. But it's definitely a great technique to test for your problem!

  • @incameet · 1 year ago

    Siri is very poor at recognizing non-native speakers. Are there any recent models which can recognize accents and understand speakers correctly?

    • @TheGlobalNLPLab · 1 year ago

      I think there are several companies/startups working on this. I don't have any specific links though.

  • @incameet · 1 year ago

    VALL-E still needs a 3-second sample. Why is it considered zero-shot?

    • @TheGlobalNLPLab · 1 year ago

      Thanks for the comment! It's zero shot, because you need to use some data to generate speech like a specific speaker. Otherwise, the problem becomes a general text-to-speech problem.

  • @incameet · 1 year ago

    "B" should be read as "billion", not as "B".

    • @TheGlobalNLPLab · 1 year ago

      Thanks for the comment! Yeah, that's because the video is actually created with AI! Check out: www.synthesia.io/?via=nlplab

    • @firelordzaki1600 · 1 year ago

      Bra go read that actual paper it's only 43pgs. Whoops I meant pages! 😬

  • @ruocaled · 1 year ago

    oh no, you finally became an AI yourself. 🙂 I tested a few English-Chinese translations based on this method; it works pretty well. The hard part is having a large database with related translation examples. But I guess if I am translating a novel, I can do only 1/10 and let AI mimic my style and finish the work.

    • @TheGlobalNLPLab · 1 year ago

      Thanks for the comment! Haha yes! 😃 Indeed, probably a big benefit will be personalisation over time.

  • @rohaanmanzoor3268 · 1 year ago

    this video looks AI generated

  • @nft8888 · 1 year ago

    Good content. Can I know what app is used for the video avatar?

    • @TheGlobalNLPLab · 1 year ago

      Thanks! This is the video avatar www.synthesia.io/

  • @juanadelossantos5671 · 1 year ago

    Thanks so much. Really helped me!

  • @andybaker4861 · 1 year ago

    She's a robot, right? She's good. It was her occasionally unnatural inflection that tipped me off.

    • @TheGlobalNLPLab · 1 year ago

      Yes indeed! I'm using www.synthesia.io/

    • @tNotimportant · 1 year ago

      It wasn't the inflections that got me it was the unnatural movements.

  • @jueliang · 1 year ago

    Is this person AI generated?

  • @ruocaled · 1 year ago

    Ha! Looks like AI already took the "prompt engineer"'s job! 😂

  • @prathams8685 · 1 year ago

    Underrated stuff

  • @motherofgod8265 · 1 year ago

    Where do I access this?

  • @brandomiranda6703 · 1 year ago

    What is the largest size you've seen with current models, e.g. on an A100?

  • @gareebmanus2387 · 1 year ago

    Thanks for sharing--very well explained!

  • @user-fz7db4ls3i · 1 year ago

    Omg, that's what I needed. I got a degree in applied linguistics and would like to study NLP, but I don't know where to start. It would be great if you could recommend resources for this, especially for algebra and the math stuff. As I gathered from job vacancies, I will also need to learn Machine Learning, right? I know it's going to be a long way, but I really want to take it.

  • @aamir122a · 1 year ago

    When you say fusion of previous hidden states with the current vector, what do you mean by that: add them, multiply them, etc.?

    • @TheGlobalNLPLab · 1 year ago

      Hey, they integrate them in the attention, such that the model attends to both the current hidden states and the previous hidden states.
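A toy illustration of the kind of fusion described in this reply: keys and values from a previous segment are concatenated with the current segment's, so queries attend over both. Shapes and variable names are illustrative only, not the paper's implementation.

```python
# Attention over concatenated past + current hidden states.
import torch
import torch.nn.functional as F

d = 16
prev_h = torch.randn(1, 8, d)   # cached hidden states from an earlier segment
cur_h = torch.randn(1, 4, d)    # hidden states of the current segment

q = cur_h                                # queries come from the current segment only
kv = torch.cat([prev_h, cur_h], dim=1)   # keys/values span past + present

attn = F.softmax(q @ kv.transpose(1, 2) / d**0.5, dim=-1)
fused = attn @ kv                # each current position mixes in past context
print(fused.shape)               # torch.Size([1, 4, 16])
```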