Deep dive: model merging, part2

Julien Simon
Jul 29, 2024

Model merging is an increasingly popular technique for adding or removing capabilities from transformer models without additional training. In a previous video, we introduced model merging and studied several merging algorithms implemented in the mergekit library (https://github.com/arcee-ai): model soups, SLERP, Task Arithmetic, TIES, DARE, and Franken-merging.

This new video builds upon the previous one and explores new merging methods: model breadcrumbs, model stock, and DELLA. We also quickly look at model merging in Arcee Cloud, which you can run for free as part of the free tier!

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

Julien Simon
Julien Simon

No responses yet

What are your thoughts?