Discussion about this post

Max Loeffler

Great work! I'm curious about the latency measurements: did you notice how time-to-first-token was affected when streaming responses?

I've also seen some reports that time-to-first-token latency with fine-tuned models is less consistent than vanilla 3.5-turbo; did you observe this?

Bryan

This is great stuff! I am at work on a similar project for custom tone (basically, duplicating my own email tone), and would welcome any more specific thoughts on training data you guys may have. I have been working in the Hugging Face ecosystem, but I'm thinking of doing a smallish ChatGPT experiment to see how it goes.
