Enterprise

AI models for your enterprise needs: A quick primer on making the right choice

Dec 19, 2024

5 min read

When I say “generative AI,” what’s the first thing that comes to mind? If you said ChatGPT, you’re not alone. One of the most common misconceptions about GenAI is that it’s one thing, a single model that can serve every need. 

In reality, there are 255 large-scale models (those with training compute over 10²³ floating-point operations, essentially the scale of GPT-3 or higher). These can be categorized in dozens of ways, depending on what you need them for, how much you’re willing to spend, how secure you need them to be, and so on.

In this blog post, we explore a few of these dimensions.

Model scale

Large-scale vs. small-scale or single-purpose micromodels

Let’s begin at the beginning. The most common way to classify an AI model is by its scale. Here’s how models compare through that lens.

Foundation models

Foundation models tend to be the largest and are trained on enormous data sets. They are characterized by their ability to:

  • Perform a wide range of tasks, whether logical, creative, or problem-solving

  • Generate multimodal output across text, images, code, or video

  • Be fine-tuned for specific tasks

Examples: OpenAI’s GPT, Google’s Gemini, Anthropic’s Claude 3.5 Sonnet

Use case: These models shine at generating content from natural language input. Writing blog posts, documentation, text summaries, and the like are great fits for this type of model.

Smaller or single-purpose micromodels 

Smaller models tend to do specific tasks well. They are typically trained intensively, and often deliberately overfit, to perform one narrow task. They are:

  • Built with fewer than 10 billion parameters

  • Lightweight, balancing performance and efficiency

  • Great for on-device applications, like mobile assistants or smart appliances

  • Useful for domain-specific tasks

Examples: TinyLlama, Mistral NeMo

Use case: Small models are great for specific tasks like code generation, language translation, and mobile applications.

Domain-applicability

General purpose vs. domain-specific models

Many AI models are built to be general-purpose, meaning they can handle a wide variety of tasks across use cases. For instance, with a single foundation model, you might build a chatbot for customer service, a sales assistant, a scenario-modeling tool, or a recommendation engine.

On the other hand, organizations are realizing that they need greater accuracy in processing the specific contexts and terminology of their domain. This has given rise to domain-specific models, which are:

  • Smaller, trained on niche data related to the industry

  • More accurate in interpreting domain-specific language

  • Personalizable with greater precision

Examples: Med-PaLM, BloombergGPT, ClimateBERT

Use case: In healthcare, domain-specific models are great for diagnostic support. In law, they excel at contract analysis, compliance review, and similar tasks.

Task complexity

General purpose vs. task-specific models

Another dimension, similar to domain-specificity, is task-specificity. Simply put, a model trained for code completion wouldn’t be able to generate videos; it’s specialized for that one task.

Task-specific models are characterized by:

  • Specialization in one particular task or a set of closely related tasks

  • Excellent accuracy and reduced hallucination on that specific task

  • Lower latency and faster processing

  • Higher interpretability

Examples: Code Llama, Veo

Use cases: A great use case for task-specific LLMs is language translation: real-time AI-based translation lets distributed teams converse across languages.

Device and deployment

Heavyweight vs. lightweight models

When we speak of model deployment, we often think of scalable cloud or sprawling on-prem infrastructure. This is true of heavyweight, powerful models. However, not all use cases lend themselves to deploying AI this way.

Lightweight models are:

  • Efficient, using as little compute as possible

  • Runnable on-device, such as on mobile phones or smart devices

  • Functional despite limited connectivity

Examples: Gemini Nano, Phi-2

Use cases: For low-latency applications, lightweight models deployed directly on devices like smartphones are a perfect fit.

Model licensing

Proprietary vs. open source models

In the server operating system space, open source commands over 90% market share, a trend likely to repeat in the LLM space as well.

Proprietary models

A proprietary large language model is developed and maintained by a single provider, which often charges a subscription fee for access. Proprietary models typically come with regular updates, higher reliability, better performance, and dedicated support.

You can deploy one in your environment or integrate it through APIs. However, you won’t have access to the source code, which limits customization, and licensing terms may restrict how you use the software.
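As a minimal sketch of what API-based integration looks like (the model name and payload shape below follow the common chat-completions convention rather than any one vendor’s exact schema; check your provider’s API reference for the fields it actually expects):

```python
import json

def build_chat_request(prompt: str, model: str = "example-model") -> dict:
    """Assemble a chat-completion-style request payload.

    The messages-based structure below is the convention most
    proprietary LLM APIs share; field names vary by provider.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful enterprise assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 256,
    }

# Serialize for an HTTP POST; the real request would also carry an
# Authorization header with your API key.
payload = json.dumps(build_chat_request("Summarize our Q3 sales notes."))
```

The point is that integration is a thin HTTP layer: your application never touches model weights, which is exactly what makes proprietary models quick to adopt and hard to customize.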

Examples: Claude 2, GPT-4

Use cases: If you’re simply integrating generative AI into your workflows for a general productivity boost, proprietary models are the easier path. For instance, Microsoft Copilot is easy to get started with if your enterprise is already on the Microsoft platform.

Open-source models

In open-source models, the source code and architecture are publicly available to use or modify. These models are often community-driven, relying on a wide range of contributors for development and maintenance. That said, it is common for large organizations like Meta, Google, and Mistral AI to develop and maintain open-source LLMs as well.

Open-source models are generally free. The availability of source code helps teams customize, fine-tune, and adapt the code to their needs. 

Examples: Meta’s Llama, Falcon 180B, Mistral 7B

Use cases: Generalized, domain-specific, and task-specific models all come in open-source flavors. Open source works best when you have the in-house capacity to review, choose, fine-tune, and implement models on your own.

With that foundation, here are some pointers on choosing the right GenAI model for your enterprise needs.

  • If your use case is generalized, spanning a wide range of tasks, choose a large general-purpose model

  • If your use case is specific and limited to a few similar tasks, choose domain-specific, task-specific, or small language models

  • If you need greater flexibility and customizability, go open-source

  • If you’re cost-sensitive, choose smaller open-source models

  • If you don’t have the resources to implement models yourself, choose proprietary models with more straightforward integration and better support
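The pointers above can be condensed into a simple lookup. This sketch is illustrative only; the category labels and decision order are my reading of the list, not a formal taxonomy:

```python
def suggest_model_type(scope: str,
                       needs_customization: bool,
                       cost_sensitive: bool,
                       has_ml_team: bool) -> str:
    """Map the decision pointers to a model category.

    scope: "general" for a wide range of tasks,
           "narrow" for a few similar tasks.
    """
    # Without in-house capacity, managed integration wins regardless.
    if not has_ml_team:
        return "proprietary model (managed integration and support)"
    if scope == "narrow":
        return "domain-specific, task-specific, or small model"
    if cost_sensitive:
        return "smaller open-source model"
    if needs_customization:
        return "open-source model"
    return "large general-purpose model"
```

For example, a cost-sensitive team with ML expertise and a broad use case lands on a smaller open-source model, while a team with no implementation resources is steered to a proprietary offering first.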

If you’re having trouble making effective and sustainable decisions, speak to Tune AI’s experts. We’ve worked with all kinds of models for use cases across enterprise departments. We’re happy to help.

Speak to Tune AI today

Written by

Anshuman Pandey

Co-founder and CEO