* Sorry for the part where my face gets blurry
Download EdrawMind for free: https://bit.ly/46xIp8G and save up to 40% here: https://bit.ly/46nbZgl
Enjoy 🙂
Become a Patron 🔥 – https://patreon.com/MatthewBerman
Join the Discord 💬 – https://discord.gg/xxysSXBxFW
Follow me on Twitter 🧠 – https://twitter.com/matthewberman
Subscribe to my Substack 🗞️ – https://matthewberman.substack.com/
Media/Sponsorship Inquiries 📈 – https://bit.ly/44TC45V
Need AI Consulting? ✅ – https://forwardfuture.ai/
Use RunPod – https://bit.ly/3OtbnQx
Links:
https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
Looks like Mistral has a model that’s even better than Mixtral 8x7B, and they’re serving it to alpha users of their API.
Scoring 8.6 on MT-Bench, it’s frighteningly close to GPT-4, and beats all other models tested.
This is their ‘Medium’ size. ‘Large’ will likely beat GPT-4. pic.twitter.com/jaoXP8lyKl
— Matt Shumer (@mattshumer_) December 11, 2023
And here's a great MoE reading list via @sophiamyang:
– The Sparsely-Gated Mixture-of-Experts Layer (2017): https://t.co/bRUBJYKDQl
– GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (2020) https://t.co/6oWby0QlMX
– MegaBlocks: Efficient Sparse…
— Sebastian Raschka (@rasbt) December 11, 2023
Official post on Mixtral 8x7B: https://t.co/ce0ZjHhLVn
Official PR into vLLM shows the inference code: https://t.co/vJbmDG9RhG
New HuggingFace explainer on MoE, very nice: https://t.co/lTaNCONUeI
In naive decoding, performance of a bit above 70B (Llama 2), at inference speed… https://t.co/OMSTfYXVsE
— Andrej Karpathy (@karpathy) December 11, 2023
https://huggingface.co/blog/moe
https://pub.towardsai.net/gpt-4-8-models-in-one-the-secret-is-out-e3d16fd1eee0
https://mistral.ai/news/mixtral-of-experts/
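The MoE articles linked above describe sparsely-gated routing: a gating network scores all experts per token, and only the top-k experts (top-2 of 8 in Mixtral's case) actually run. A minimal toy sketch of that idea, using NumPy and treating each "expert" as a single linear map for brevity (the shapes and expert count here are illustrative, not Mixtral's real configuration):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SparseMoELayer:
    """Toy sparsely-gated MoE layer: route each token to the top-k of n experts."""
    def __init__(self, dim, n_experts=8, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.gate = rng.standard_normal((dim, n_experts)) * 0.02
        # each "expert" is just one linear map here; real models use full MLPs
        self.experts = rng.standard_normal((n_experts, dim, dim)) * 0.02
        self.top_k = top_k

    def __call__(self, x):
        # x: (tokens, dim)
        logits = x @ self.gate                                  # (tokens, n_experts)
        topk = np.argsort(logits, axis=-1)[:, -self.top_k:]     # top-k expert indices per token
        weights = softmax(np.take_along_axis(logits, topk, axis=-1), axis=-1)
        out = np.zeros_like(x)
        for t in range(x.shape[0]):            # each token only visits its top-k experts
            for slot in range(self.top_k):
                e = topk[t, slot]
                out[t] += weights[t, slot] * (x[t] @ self.experts[e])
        return out

moe = SparseMoELayer(dim=16)
y = moe(np.ones((4, 16)))
print(y.shape)  # (4, 16)
```

This is why inference can be fast relative to total parameter count, as the Karpathy quote above notes: with top-2 of 8 experts, each token touches only a quarter of the expert weights even though all of them must sit in memory.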
Chapters:
0:00 – About Mixtral 8x7B
9:00 – Installation Guide
13:06 – Mixtral Tests
#EdrawMind #EdrawMindAI #aipresentation #aimindmap