great work! I'm curious about the latency measurements; did you notice how time-to-first-token affected when streaming responses?
I've also seen some reports that time-to-first-token latency with fine-tuned models is less consistent than vanilla 3.5-turbo; did you observe this?
Hey, great question! We actually redid the study using streaming and found that the fine-tuned model is still consistently faster than the base model.
Though we did notice that the first request tends to be slower, so there may be a warm-up period.
awesome, that makes sense, thank you! do you happen to remember the approximate time-to-first-token for the finetuned model? fwiw, I've been seeing ~0.5-1.5s for 3.5-turbo-4k (and a bit less for 3.5-turbo-16k)
similar, but it varies a lot! Also I think there's some noise introduced by network speed, so it might be hard to benchmark exactly.
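For anyone else trying to benchmark this: here's a minimal sketch of how one might measure time-to-first-token over a streaming response. The `fake_stream` generator below is a hypothetical stand-in for a real streaming iterator (e.g. the deltas from an OpenAI streaming chat completion); swap in the real stream to measure actual TTFT. To reduce the network/warm-up noise mentioned above, you'd want to run many trials and discard the first.

```python
import time
from typing import Iterable, Tuple

def measure_ttft(stream: Iterable[str]) -> Tuple[float, str]:
    """Return (seconds until the first chunk arrives, full concatenated text).

    `stream` is any iterator yielding text chunks. TTFT is measured as the
    wall-clock time from starting iteration to receiving the first chunk.
    """
    start = time.perf_counter()
    ttft = None
    chunks = []
    for chunk in stream:
        if ttft is None:
            ttft = time.perf_counter() - start  # first token observed
        chunks.append(chunk)
    return (ttft if ttft is not None else float("inf")), "".join(chunks)

# Hypothetical stand-in for an API stream: first token after ~50 ms,
# then a few more tokens in quick succession.
def fake_stream():
    time.sleep(0.05)
    yield "Hello"
    for tok in [",", " world"]:
        time.sleep(0.005)
        yield tok

ttft, text = measure_ttft(fake_stream())
```

In practice you'd collect `ttft` over, say, 20+ requests per model, drop the first request (warm-up), and compare medians rather than single samples, since per-request variance can easily swamp the model difference.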
This is great stuff! I'm working on a similar project for custom tone (basically, duplicating my own email tone) and would welcome any more specific thoughts you may have on training data. I've been working in the Hugging Face ecosystem, but I'm thinking of doing a smallish ChatGPT experiment to see how it goes.