Deploy FALCON-180B Instantly

Deploy FALCON-180B Instantly: The NEW No. 1 Open-Source AI Model

Jan 2024


Today we're diving into Falcon 180B, the latest large language model released by the Technology Innovation Institute in the UAE.

Falcon 180B is trained on 3.5 trillion tokens and has 180 billion parameters, making it one of the largest open-source language models. It's not just big: on the benchmarks it is also relatively powerful. Falcon 180B outperforms competitors like Meta's Llama 2 on many benchmarks, including reasoning and coding proficiency.

It’s also very close in performance to Google's PaLM 2 Large, which is used to power Bard.

When it comes to its performance against OpenAI's models, it lies between GPT-3.5 and GPT-4.

For more context, Google's PaLM 2 is said to have over 350 billion parameters and GPT-4 over 1 trillion parameters.

Both of these numbers are speculation and there are a lot of rumors, but many experts agree on them based on what they've heard from insiders within these two companies.

Given all of that, Falcon 180B performs relatively well in comparison, despite being smaller than both of these models.

The table above reinforces the trend we've been seeing over the last several months to a year: new open-source large language models are catching up to the state-of-the-art closed-source large language models that already exist.

Next up, let's take a look at commercial use and licensing. Falcon 180B is openly available, but it comes with its own specific licensing framework, which is based on Apache 2.0 and is quite different from the licensing of the Falcon 40B model.

This time around, Falcon 180B is available for commercial use, but there are some restrictions, particularly against hosting use.

That means if you plan on creating a service that provides a hosted version of the model, you might run into issues, because there are restrictions against providing a commercially hosted version of Falcon 180B.

You can find the Falcon 180B model on Hugging Face, where it has already been downloaded close to 60,000 times in the past month.

By comparison, the Falcon 40B model has been used over 12 million times already, so we can definitely expect the Falcon 180B number to grow as time goes on.

Since the Falcon 180B model is so large, there are significant hardware requirements and restrictions when it comes to deploying it.

According to the Hugging Face page, running inference on the model in full precision requires approximately eight A100 80 GB cards. That is pretty expensive, considering that each of these cards retails for about 15,000 US dollars.
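To put those numbers side by side, here is a rough back-of-the-envelope sketch. It assumes the weights are stored as 16-bit values (2 bytes per parameter), which is my assumption rather than something stated above, and it only counts the weights themselves; activations and the attention cache need extra memory on top.

```python
# Back-of-the-envelope sizing for Falcon 180B, using the figures quoted above.
PARAMS = 180e9           # 180 billion parameters
GPU_COUNT = 8            # number of A100 cards
GPU_MEMORY_GB = 80       # memory per A100 card
GPU_PRICE_USD = 15_000   # approximate retail price per card

# Assumption (not from the article): 16-bit weights, i.e. 2 bytes per parameter.
BYTES_PER_PARAM = 2


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


weights_gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM)
total_gpu_memory_gb = GPU_COUNT * GPU_MEMORY_GB
hardware_cost_usd = GPU_COUNT * GPU_PRICE_USD

print(f"Weights alone: {weights_gb:.0f} GB")
print(f"Memory across all cards: {total_gpu_memory_gb} GB")
print(f"Left over for everything else: {total_gpu_memory_gb - weights_gb:.0f} GB")
print(f"Cost of the cards: ${hardware_cost_usd:,}")
```

Under that assumption the weights alone take up more than half of the combined memory of the eight cards, and buying the cards outright comes to roughly $120,000.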

That being said, a lot of people will use cloud compute services to run it, and you can absolutely run it on something like AWS SageMaker. To give you an approximate idea, running it there for a month could cost about twenty-two thousand dollars.
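As a quick sanity check on the cloud figure, the sketch below converts the monthly estimate into an hourly rate and compares it with buying eight cards at about $15,000 each. The 30-day, around-the-clock month is my assumption, and the comparison ignores power, hosting, and the rest of the server, so treat it as a rough guide only.

```python
# Rough cloud-versus-purchase comparison using the figures quoted above.
MONTHLY_CLOUD_COST_USD = 22_000   # approximate monthly SageMaker estimate
GPU_COUNT = 8
GPU_PRICE_USD = 15_000            # approximate retail price per A100 card

# Assumption (not from the article): a 30-day month, running 24 hours a day.
HOURS_PER_MONTH = 30 * 24

hourly_cost = MONTHLY_CLOUD_COST_USD / HOURS_PER_MONTH
purchase_cost = GPU_COUNT * GPU_PRICE_USD
break_even_months = purchase_cost / MONTHLY_CLOUD_COST_USD

print(f"Cloud cost per hour: ${hourly_cost:.2f}")
print(f"Cost of buying the cards: ${purchase_cost:,}")
print(f"Months of cloud usage that equal the purchase price: {break_even_months:.1f}")
```

Either way, the bill lands in the tens of thousands of dollars within the first month.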

This definitely limits the usage of Falcon 180B to enterprises or larger organizations that can afford to run the model.

There are two different Falcon 180B models on Hugging Face as of now.

There’s the base pre-trained Falcon 180B model, and there's also a fine-tuned version, Falcon 180B Chat.
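The practical difference is that the chat version expects the conversation laid out as labeled turns, while the base model simply continues whatever text it is given. Below is a minimal sketch of building such a prompt. The `System:` / `User:` / `Falcon:` turn labels are my assumption about the chat format, and `build_falcon_chat_prompt` is a hypothetical helper, so check the model card before relying on either.

```python
from typing import List, Optional, Tuple


def build_falcon_chat_prompt(
    user_message: str,
    history: Optional[List[Tuple[str, str]]] = None,
    system_prompt: str = "",
) -> str:
    """Lay out a conversation as labeled turns (assumed format, see above)."""
    lines = []
    if system_prompt:
        lines.append(f"System: {system_prompt}")
    # Each history entry is a (user message, model reply) pair from earlier turns.
    for past_user, past_reply in history or []:
        lines.append(f"User: {past_user}")
        lines.append(f"Falcon: {past_reply}")
    lines.append(f"User: {user_message}")
    # End with the model's label so the model writes the next reply.
    lines.append("Falcon:")
    return "\n".join(lines)


print(build_falcon_chat_prompt(
    "Write a Python function that calculates the area of a circle.",
    system_prompt="You are a helpful coding assistant.",
))
```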

To run the Falcon 180B model, we're going to try out the Hugging Face chat demo.

Once you're on the main website for this, click "try it now in our chat demo."

That takes you to the demo page. To run it, we need to duplicate this Space, and we also need a Pro version of Hugging Face and its token.

Click "Duplicate this Space" and choose a base-level hardware option; you can choose the Nvidia A10G large.

Once you've selected that, enter your Hugging Face token and then click "Duplicate Space."

I've already gone ahead and created a running instance of the Falcon 180B model.

I'm going to first test it out with some very basic text-to-code generation.

I'm going to ask it to help me create a function in Python which calculates the area of a circle.

In real time, this ran pretty fast, and it did an excellent job of doing exactly what I asked it to do.

Keep in mind that the type of processor you're using greatly affects the speed you're going to get from this model.
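For reference, a correct answer to the circle-area prompt looks something like the sketch below. This is my own illustration, not the model's actual output, and the negative-radius check is an extra I added.

```python
import math


def circle_area(radius: float) -> float:
    """Return the area of a circle with the given radius."""
    if radius < 0:
        # Extra guard added for illustration; a negative radius has no meaning here.
        raise ValueError("radius must be non-negative")
    return math.pi * radius ** 2


print(circle_area(2.0))
```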

Next, let me ask it to convert some Python code into Java code to test that capability.

I asked it to convert a very basic Python function to Java. The Python function just returns "hello" followed by your name. In Java, the model created a class called ‘greeter’ with two separate functions.

One function forms the string "hello" followed by your name, and the other prints out that sentence.

This is really good, and it follows Java programming principles as well.
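To make the conversion test concrete, here is a sketch of the kind of Python input described above, followed by a Python mirror of the class structure the model produced in Java. Both are my own illustrations: the names `greet`, `build_greeting`, and `print_greeting` are hypothetical, and the model's real output was Java, not Python.

```python
def greet(name: str) -> str:
    """The kind of input function described above: returns a greeting."""
    return f"Hello {name}"


class Greeter:
    """Python mirror of the two-method class structure described above."""

    def build_greeting(self, name: str) -> str:
        # First method: form the greeting string.
        return f"Hello {name}"

    def print_greeting(self, name: str) -> None:
        # Second method: print the sentence built by the first one.
        print(self.build_greeting(name))


Greeter().print_greeting("Ada")
```

Splitting string building from printing is what makes the Java version feel idiomatic: the greeting can be reused or tested without touching the console.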

Next, we're going to test how good Falcon 180B is at handling bugs in code.

I asked it to help me identify a bug in the following code, a very simple division function that divides a by b. The bug is that if b is zero, the function throws an error, and that is exactly what Falcon 180B identified.

According to the model, the bug is that the function does not handle division by zero, which can result in a ZeroDivisionError, and it then provided an updated version of the function that handles the error.

In my experience, it is significantly better than the Falcon 40B model. If you want a more in-depth explanation of the Falcon 40B model, you can check out the video above.
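The division bug from this test is easy to reproduce. The sketch below shows a function with that bug and one possible fix. It is my own illustration, not the model's exact output, and raising a `ValueError` is only one of several reasonable ways to handle the zero case.

```python
def divide(a: float, b: float) -> float:
    """Buggy version: raises ZeroDivisionError when b is zero."""
    return a / b


def safe_divide(a: float, b: float) -> float:
    """One possible fix: check for zero and fail with a clear message."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b


print(safe_divide(10, 2))

try:
    safe_divide(10, 0)
except ValueError as error:
    print(f"Handled: {error}")
```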

That being said, the Falcon 180B model is truly amazing, especially when it comes to its coding capabilities. These are impressive capabilities for an open-source model. We often see these types of results with things like GPT-3.5 or GPT-4, but it's amazing that we are seeing similar results in open-source models as well.
