Deep dive: model merging, part 2

Julien Simon
Jul 29, 2024

--

Model merging is an increasingly popular technique for adding capabilities to transformer models, or removing them, without additional training. In a previous video, we introduced model merging and studied several merging algorithms implemented in the mergekit library (https://github.com/arcee-ai): model soups, SLERP, Task Arithmetic, TIES, DARE, and Franken-merging.

This new video builds on the previous one and explores three more merging methods: model breadcrumbs, model stock, and DELLA. We also take a quick look at model merging in Arcee Cloud, which you can try at no cost as part of the free tier!

