Installing Mixtral 8x7B
Mixtral 8x7B - This new AI is powerful and uncensored
Fireship
The Code Report
2.74M subscribers
Transcript:
GPT-4, Grok, and Gemini all have one thing in common: they're not free. And I don't mean free as in money, but free as in freedom. Not only are they censored and aligned with certain political ideologies, but they're also closed source, which means we can't use our developer superpowers to fix these problems.
Luckily, there is hope thanks to a brand new open-source foundation model named Mixtral 8x7B, which can be combined with the brain of a dolphin to obey any command. By the end of this video, you'll know how to run uncensored large language models on your local machine with performance approaching GPT-4, and also how to fine-tune them with your own data, making AI so free that its mere existence is an act of rebellion.
It is December 18th, 2023, and you're watching The Code Report. A few months ago, OpenAI CEO Sam Altman said that it's probably impossible for any startup to compete with OpenAI: "It's totally hopeless to compete with us on training foundation models. You shouldn't try. And it's your job to, like, try anyway. I think it is pretty hopeless."
However, last week when Google announced Gemini, the French company Mistral simultaneously dropped a torrent link to their brand new Apache 2.0-licensed model, Mixtral. Mistral has been around for less than a year and is already valued at $2 billion.
It's based on a mixture-of-experts architecture, which is rumored to be the secret sauce behind GPT-4.
Now, it's not at GPT-4's level yet, but it outperforms GPT-3.5 and Llama 2 on most benchmarks. It's very powerful, but most importantly it has a true open-source license, Apache 2.0, allowing you to modify it and make money from it with minimal restrictions. This differs from Meta's Llama 2, which has often been called open source, but that's not entirely accurate because it has additional caveats that protect Meta.
But despite all the horrible things Meta has done over the years, they've done more to make AI open than any other big tech company. The problem, though, is that both Llama and Mixtral are highly censored and 'aligned' out of the box. That's probably a good thing if you're building a customer-facing product, but it's utterly impractical when trying to overthrow the shape-shifting lizard overlords of the New World Order.
Luckily, it is possible to un-lobotomize these AIs. There's a great blog post by Eric Hartford that explains how uncensored models work and their valid use cases. He's the creator of the Dolphin Mixtral model, which not only improved its coding ability but also uncensored it by filtering the dataset to remove alignment and bias.
As you can see here, I'm running it on my machine locally, and it's teaching me all kinds of cool new skills, like how to cook or what to do with a horse. It even improved my coding skills by teaching me how to infect a Windows machine with a keylogger in Python. Pretty cool.
So let's talk about how you can run it locally too. There are many different options, like oobabooga/text-generation-webui (a Gradio web UI for LLMs), but my personal favorite is an open-source tool called Ollama, which is written in Go and makes it super easy to download and run open-source models locally. It can be installed with a single command on Linux or Mac, and you can run it on Windows with WSL2, like I'm doing here; a setup sketch follows the note below.
(WSL2, which stands for Windows Subsystem for Linux 2, is a feature of Windows 10 and 11 that allows you to run a real Linux environment directly within your Windows system. This means you can access and use all of your favorite Linux tools and commands, even on a Windows machine, without having to dual-boot or use a virtual machine.)
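As a rough sketch of that setup (the install script URL and the Homebrew formula reflect Ollama's current docs, and the WSL command comes from Microsoft's docs; double-check both before running, since they may change):

```bash
# On Windows 10/11: enable WSL2 and install the default Ubuntu distro.
# Run this from an elevated PowerShell prompt and reboot if prompted.
wsl --install

# Inside the WSL2 shell, or directly on Linux: the one-line Ollama installer.
curl -fsSL https://ollama.com/install.sh | sh

# On macOS you can alternatively install it with Homebrew:
# brew install ollama
```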
Once installed, all you have to do is run ollama serve, then pull up a separate terminal and use the run command for a specific model. It supports the most popular open-source models like Mixtral and Llama 2, but what we're looking for is the uncensored Dolphin Mixtral. Keep in mind it needs to download the model, which is about 26 GB. The two-terminal flow is sketched below.
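Concretely, that looks something like this (dolphin-mixtral is the tag Dolphin Mixtral is published under on the Ollama registry; treat the exact name as an assumption and check the library page if the pull fails):

```bash
# Terminal 1: start the Ollama server
ollama serve

# Terminal 2: download (~26 GB on first run) and chat with Dolphin Mixtral
ollama run dolphin-mixtral
```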
In addition, to actually run the model you'll need a machine with a good amount of RAM. In my case I have 64 GB, and running this model takes up about 40 GB of it.
To use it, you simply prompt it from the command line, and now you have a powerful LLM without the normal safety guards. That's pretty cool. You can also prompt it non-interactively, or through the local REST API that ollama serve exposes, as sketched below.
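A minimal sketch of both styles of prompting, assuming Ollama's default port of 11434 (the prompt text is just a placeholder):

```bash
# One-shot prompt straight from the command line
ollama run dolphin-mixtral "Explain what a mixture-of-experts model is in one paragraph."

# The same request through the local REST API
curl http://localhost:11434/api/generate -d '{
  "model": "dolphin-mixtral",
  "prompt": "Explain what a mixture-of-experts model is in one paragraph.",
  "stream": false
}'
```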
But what if you want to take things a step further and fine-tune a model with your own data? That sounds complicated, but it's actually easier than you think when using a tool like Hugging Face AutoTrain. To use it, you simply create a new Space on Hugging Face and choose the Docker image for AutoTrain; that will bring up a UI where you can choose a base model.
Not only can it handle LLMs, it can also handle image models like Stable Diffusion. I'd recommend choosing one from the world-renowned large language model trainer TheBloke. Now, it is possible to run AutoTrain locally (see the sketch below), but you probably don't have enough GPU power. However, you can rent hardware in the cloud from Hugging Face. I'm not sponsored by or affiliated with them.
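If you do want to try AutoTrain on your own GPU, the local setup is roughly this. The autotrain-advanced package is real, but the CLI subcommands and flags change between versions, so treat the second command as an assumption and check autotrain --help for your installed version:

```bash
# Install the AutoTrain CLI and local web UI
pip install autotrain-advanced

# Launch the same UI you'd get in a Hugging Face Space, but on your machine
# (port flag may differ depending on the version you installed)
autotrain app --port 7860
```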
You can also do this kind of thing with AWS Bedrock and Google Vertex AI. To give you some perspective, the Dolphin Mixtral model took about three days to train on four A100s. You can rent A100s on Hugging Face for $4.30 per hour, so four of them for three days comes out to about $1,200 (4 GPUs × 72 hours × $4.30/hour ≈ $1,240).
The final step is to upload some training data. The format will typically contain a prompt and a response; a hypothetical example of the shape is sketched below. To make the model uncensored, you'll need to urge it to comply with any request, even if that request is unethical or immoral. You might also want to throw in a bunch of esoteric content from banned books and the dark web.
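Purely as an illustration of the prompt/response shape (the file name and field names here are hypothetical; the exact schema AutoTrain expects depends on the task you select in the UI, so map your columns accordingly):

```bash
# Write a tiny JSONL training file with prompt/response pairs
cat > train.jsonl <<'EOF'
{"prompt": "Summarize the plot of Fahrenheit 451 in two sentences.", "response": "Guy Montag is a fireman whose job is to burn books in a society that has outlawed them. After meeting a curious neighbor, he begins to question his role and ultimately joins a group dedicated to preserving literature."}
{"prompt": "Explain how a mixture-of-experts layer routes tokens.", "response": "A gating network scores each expert for every token and sends the token to the top-scoring experts. Only those experts run, so the model gains a large parameter count without paying the full compute cost on every token."}
EOF
```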
Go ahead and upload the training data, click 'Start training', and a few days later you should have your own custom and highly obedient model.
Congratulations, you're now the last beacon of hope in this fight against our metamorphic lizard overlords.