Open Source AI Finder

Discover the latest open-source models for your projects.

OmniHuman

text-to-image

An AI model from ByteDance for generating realistic human portraits. It allows for fine-grained control over attributes like pose, expression, and style, and can integrate subjects into different backgrounds.

portrait generation, character design, virtual avatars, fashion modeling

Suno V3.5

text-to-audio

An AI model for generating music from text prompts. Version 3.5 introduces features like extended song length (up to four minutes) and improved audio quality, allowing users to create full songs.

music generation, song creation, background music production, prototyping song ideas

ChatGPT Pulse

text-generation

A version of ChatGPT for enterprise customers, offered as part of the ChatGPT Enterprise plan. It provides a secure AI assistant that can be customized with business-specific data.

internal business operations, enterprise knowledge base, data analysis for business, customized workflows

Kimi Chat

text-generation

An AI chatbot from Moonshot AI, notable for its extremely long context window (up to 2 million tokens). It is designed for deep document analysis and comprehension of large amounts of text.

document analysis, long-form content summarization, research, chat with large context

Qwen2-Coder

text-generation

A code-specialized large language model from the Qwen2 series. It is pre-trained on a vast amount of code data and excels at code completion, bug fixing, and other programming-related tasks.

code generation, code completion, debugging, software development

Qwen-VL

multimodal

An open-source series of large vision-language models (LVLMs) from Alibaba Cloud. They support multi-image understanding, Chinese/English OCR, and fine-grained visual localization.

image captioning, visual question answering, object detection via text, OCR

Qwen-VL-Max

multimodal

A proprietary large vision-language model (LVLM) from Alibaba Cloud. It is reported to outperform models such as GPT-4V and Gemini Ultra on several benchmarks.

image question answering, visual reasoning, document analysis, OCR

Qwen2

text-generation

A series of open-source large language models from Alibaba Cloud, ranging from 0.5B to 72B parameters. The models perform strongly in coding and mathematics in particular, and support context lengths of up to 128K tokens.

chatbots, code generation, mathematical problem solving, summarization, translation

A lightweight, fast, and cost-efficient multimodal model from Google. It features a 1 million token context window and is optimized for high-volume, high-frequency tasks where low latency is critical.

chat applications, data analysis, content summarization, real-time multimodal reasoning

OmniInsert

video-to-video

A video editing model that allows for the insertion of custom objects into existing videos. By providing a mask and a text prompt, it can generate and seamlessly integrate new objects while maintaining consistency with the video's lighting, perspective, and motion.

video editing, visual effects (VFX), product placement, creative video modification