Stable Diffusion WebUI Forge

Formerly the go-to enhanced version of the most popular SD WebUI, offering a 60% improvement in memory efficiency over the original.

Image & Video | ⭐ 8.5k | GPL-3.0

Chatterbox (Resemble AI)

An MIT-licensed open-source TTS model built on a Llama-based backbone, outperforming ElevenLabs in blind tests.

Voice & TTS | ⭐ 12.5k | MIT

Wav2Lip

A classic lip-sync solution that combines any video with any audio to deliver accurate mouth movements, serving as the underlying engine for numerous projects.

Digital Human | ⭐ 11.0k | MIT

SillyTavern

A visual front-end for immersive chat and role-play, featuring a character card system, a world database, group chats, and TTS voice synthesis.

Chat | ⭐ 12.0k | AGPL-3.0

CosyVoice (Alibaba)

Delivers the best results in Chinese, supporting dialects, emotion control, and streaming inference, with latency as low as 150 ms.

Voice & TTS | ⭐ 13.7k | Apache-2.0

AnimaHub

An AI-powered comic creation system whose workflow covers importing novels, breaking down scripts, creating characters, and synthesizing videos.

Comics | ⭐ 7.8k | Apache-2.0

DiffRhythm

Generates a full 4-minute-45-second song in just 10 seconds, with precise alignment to LRC lyrics.

Music & Audio | ⭐ 2.3k | Apache-2.0

Tortoise TTS

High-quality neural text-to-speech that delivers excellent sound quality at the cost of slower speed, making it suitable for offline high-fidelity voice dubbing.

Voice & TTS | ⭐ 14.0k | Apache-2.0

browser-use

Lets an AI agent operate a real browser, handling form filling, web scraping, and navigation fully automatically.

Agent | ⭐ 97.0k | MIT

LibreChat

An enterprise-grade multimodal chat platform featuring multi-user support, RBAC permissions, and audit logs, designed for self-hosted team deployment.

Chat | ⭐ 21.0k | MIT

MaxKB

Designed with Chinese users in mind: it supports one-line local deployment via Docker and zero-code integration into business systems.

Knowledge Base | ⭐ 20.3k | GPL-3.0

Notex

An open-source alternative to NotebookLM that supports multiple file formats, including PDF, DOCX, and PPTX, along with intelligent Q&A, PPT generation, and mind map creation.

Office | ⭐ 4.8k | MIT

ChatGPT-on-WeChat (CowAgent)

Once a benchmark project in the Chinese AI agent community, supporting integration with multiple platforms including WeChat, WeCom, DingTalk, and Lark.

Agent | ⭐ 45.0k | MIT

GFPGAN

A face restoration algorithm developed by Tencent, and a classic tool for restoring old photos and repairing faces in AI-generated images.

Image & Video | ⭐ 37.0k | Apache-2.0

AnimateDiff

Brings motion to SD: text-to-video and image-to-video generation, with Motion LoRA to control the intensity of movement.

Image & Video | ⭐ 13.0k | Apache-2.0

Fish Speech

High-quality multilingual text-to-speech, powered by the VITS2 architecture, supporting more than 8 languages with audio quality close to human speech.

Voice & TTS | ⭐ 10.0k | BSD-3-Clause

DeepPrintFilm

Frame-driven AI comic generation: precise keyframe control combined with AI interpolation of intermediate frames for smooth motion.

Comics | ⭐ 7.2k | MIT

InternLM2.5 (Shanghai AI Lab)

From the Shanghai AI Laboratory, with strong reasoning and mathematical capabilities and a very long context length, making it a top choice for academic research.

Open Models | ⭐ 18.0k | Apache-2.0

Rowboat

A powerful solution for meeting scenarios: it connects to Gmail and calendars to build a knowledge base and automatically prepares meeting summaries.

Office | ⭐ 4.2k | MIT

KrillinAI

A complete solution for video translation and dubbing, featuring AI-powered subtitle translation, synchronized voiceover, and one-click publishing, making it well suited to businesses expanding overseas.

Video Tools | ⭐ 5.0k | MIT

LocalAI

A drop-in substitute for the OpenAI API: it runs entirely on CPU and supports multi-model routing, RAG, and function calling.

Local Deploy | ⭐ 28.0k | MIT

ChatALL

Ask more than 10 different AIs the same question at once and compare their responses side by side, with one-click batch sending to help identify the best answer.

Chat | ⭐ 16.4k | Apache-2.0

EchoMimicV2

Developed by Ant Group/DAMO Academy: audio-driven half-body animation that synchronizes the head, gestures, and upper body, making it a top choice for virtual streamers.

Digital Human | ⭐ 4.6k | Apache-2.0

Open Cowork

A desktop virtual colleague that operates a computer the way a human would: it views the screen, clicks buttons, fills out forms, exports data, and sends it to Lark.

Office | ⭐ 1.5k | MIT

Jan

A stylish, cross-platform desktop AI platform that integrates MCP along with a rich plugin ecosystem.

Local Deploy | ⭐ 22.0k | AGPL-3.0

Cline

A fully self-hosted agent extension for VS Code that executes tasks end to end: reading files, writing code, running commands, and browsing the web.

Programming | ⭐ 25.0k | Apache-2.0

LangChain + LangGraph

The industry-standard LLM application framework (127K★), now with the official LangGraph 1.0 release, featuring a stateful graph-based agent orchestration engine.

Agent | ⭐ 127.0k | MIT

Sana (NVIDIA)

Created by NVIDIA: 4K high-resolution image generation with outstanding image quality.

Image & Video | ⭐ 7.5k | Apache-2.0

Jan (Chat Mode)

The chat mode of the Jan desktop AI platform runs local models on the Cortex engine, so offline conversations incur no network latency.

Chat | ⭐ 22.0k | AGPL-3.0

GraphRAG (Microsoft)

Developed by Microsoft: a RAG solution enhanced by knowledge graphs, enabling entity-relationship reasoning along with global understanding.

Knowledge Base | ⭐ 20.0k | MIT

F5-TTS

Ultra-fast voice cloning in just 2 seconds, with the fastest inference speed (RTF of 0.15) and an MIT license that permits commercial use.

Voice & TTS | ⭐ 12.0k | MIT

AIComicBuilder

An AI-powered comic creation tool that integrates script development, character design, storyboarding, and image and video generation into a single workflow.

Comics | ⭐ 6.5k | MIT

Yi-1.5 (01.AI)

The best choice for writing long Chinese texts — available in 6B to 34B parameter sizes, with the 34B version offering exceptional value for money.

Open Models | ⭐ 12.0k | Yi License

LatentSync (ByteDance)

From ByteDance: a latent-space diffusion model for lip synchronization, delivering high-precision mouth shape matching.

Digital Human | ⭐ 9.0k | Apache-2.0

AionUi

A unified desktop interface for various AI tools: it wraps the CLIs of models such as Gemini, Claude, and Qwen, and adds file management and Excel processing.

Office | ⭐ 3.5k | MIT

Langflow

A drag-and-drop user interface for LangChain that enables low-code development of RAG and multi-step agent workflows.

Agent | ⭐ 32.0k | MIT

Text Generation WebUI (Oobabooga)

A full-featured WebUI for developers, supporting LoRA fine-tuning, a rich plugin ecosystem, and compatibility with Hugging Face.

Local Deploy | ⭐ 20.0k | AGPL-3.0

Roo Code

An AI development team simulator that runs multiple agents in parallel, letting a single user oversee and coordinate an entire AI programming team.

Programming | ⭐ 21.0k | Apache-2.0

FastGPT

A visual workflow orchestration platform optimized for Chinese-language knowledge bases and RAG, featuring externally facing chat as well as integration with WeCom.

Knowledge Base | ⭐ 15.0k | Apache-2.0

HeyGem

Create a digital avatar in just 30 seconds — available in 8 languages with offline functionality, making it accessible even for complete beginners.

Digital Human | ⭐ 8.5k | Apache-2.0

SongGen

One-step text-to-song conversion — based on a paper presented at ICML 2025, featuring an extremely simplified workflow.

Music & Audio | ⭐ 310 | MIT

Kimi K2.5

A comprehensive office solution that enables lossless conversion between PPT, Word, Excel, and PDF files. It supports up to 100 agents working in parallel on tasks of as many as 1,500 steps.

Office | ⭐ 3.8k | Apache-2.0

AutoClip

Agent-powered highlight editing, featuring AI-driven identification of standout moments, automatic scoring, and batch processing, making it an excellent tool for slicing live streams.

Video Tools | ⭐ 3.3k | MIT

AnythingLLM

A personal knowledge base and RAG workspace: drag in documents to get instant question answering, and set up a private RAG system within 5 minutes.

Local Deploy | ⭐ 17.0k | MIT

SWE-agent

Developed by Princeton: lets Claude and GPT autonomously use a terminal to fix GitHub issues, with leading SWE-bench performance.

Programming | ⭐ 17.0k | MIT

CrewAI

Role-playing multi-agent orchestration (48K★): define roles, goals, and backstories, and let the agents work together autonomously.

Agent | ⭐ 48.0k | MIT

CogVideo (Zhipu AI)

Developed by Zhipu AI, this is a video generation model optimized for Chinese that enables fast creation of 5-second videos.

Image & Video | ⭐ 8.5k | Apache-2.0

big-AGI

An AI programming platform designed for developers, featuring multi-session parallel processing, tool invocation capabilities, code highlighting, and the ability to execute functions.

Chat | ⭐ 9.0k | MIT

OpenSearch + RAG

AWS's open-source search engine combined with a RAG plugin, enabling large-scale document retrieval, vector search, and full-text search.

Knowledge Base | ⭐ 10.3k | Apache-2.0

Diffutoon

Converts live-action videos into anime style: SD-powered video style transfer from photorealistic footage to an anime look.

Comics | ⭐ 8.5k | Apache-2.0

Baichuan2 (Baichuan AI)

Trained on robust Chinese-language data and available in 7B and 13B parameter versions, it performs consistently on Chinese instructions, making it suitable for commercial use.

Open Models | ⭐ 11.0k | Baichuan License

LiveTalking

A real-time, end-to-end digital human solution combining MuseTalk, Wav2Lip, and ER-NeRF, with WebRTC for live streaming.

Digital Human | ⭐ 6.9k | MIT

FluxMusic

Rectified-flow music generation: a stable training process combined with fast generation speed.

Music & Audio | ⭐ 1.7k | MIT

OfficeClaw (Huawei Cloud)

An enterprise-grade AI office productivity agent featuring a team of specialist agents, along with PPT templates in which every element is editable.

Office | ⭐ 3.2k | Apache-2.0

videodl

A video downloader compatible with over 30 platforms — including TikTok, Bilibili, YouTube, and more — that prioritizes high-definition downloads without watermarks.

Video Tools | ⭐ 4.5k | MIT

KoboldCPP

A specialized inference engine for creative writing — featuring single-file deployment, multimodal capabilities, and MCP bridging.

Local Deploy | ⭐ 15.0k | AGPL-3.0

Plandex

Designed for large-scale projects, with a 2-million-token context capacity, a diff review sandbox, and support for combining multiple LLMs.

Programming | ⭐ 14.0k | Apache-2.0

AutoGen / AG2 (Microsoft)

Microsoft's multi-agent conversation framework (48K★), with a GroupChat mode, code generation and execution, and an alternative version of MAF.

Agent | ⭐ 48.0k | CC-BY-4.0

Bernini (ByteDance)

Open-sourced by ByteDance: a unified framework for image and video generation that covers creation, repair, and enhancement in one place.

Image & Video | ⭐ 6.2k | Apache-2.0

Chroma

An AI-native vector database designed for RAG, with integration in just 5 lines of code for a smooth developer experience.

Knowledge Base | ⭐ 18.0k | Apache-2.0

MiniCPM-3 (OpenBMB)

Small parameter size, yet impressive capabilities — 4B parameters deliver performance comparable to 7B models, and it runs smoothly on consumer-grade GPUs.

Open Models | ⭐ 14.0k | Apache-2.0

Sonic (Tencent & Zhejiang Univ.)

High-precision lip synchronization from a single photo, supporting long videos with a straightforward workflow.

Digital Human | ⭐ 7.5k | Apache-2.0

SongBloom

A coarse-to-fine strategy that alternates between autoregressive sketching and diffusion-based refinement to keep the song structure coherent.

Music & Audio | ⭐ 652 | MIT

igedits

An open-source alternative to OpusClip that automatically cuts long videos into short clips, with subtitle generation, face-centered framing, and popularity scoring.

Video Tools | ⭐ 3.8k | MIT

privateGPT

A privacy-first document Q&A system — 100% offline RAG functionality powered by Milvus and Qdrant vector databases.

Local Deploy | ⭐ 12.0k | Apache-2.0

Kilo Code

Over 500 pre-trained models, 5 specialized modes, and parallel collaboration among multiple agents. The AutoFree mode operates at zero cost.

Programming | ⭐ 13.0k | Apache-2.0

Goose

An extensible AI agent built in Rust (48K★), offering integrated installation, execution, editing, and testing capabilities, and compatible with any large language model (LLM).

Agent | ⭐ 48.0k | Apache-2.0

Hotshot-XL

A native video generation model for SDXL: high-quality AI animated GIF creation achieved by fine-tuning SDXL.

Image & Video | ⭐ 5.8k | Apache-2.0

Agnaistic

A lightweight front-end specialized in chat and role-play, featuring Markdown-based character cards, text adventure scenarios, and one-click local deployment.

Chat | ⭐ 4.5k | AGPL-3.0

Piper

An extremely lightweight embedded text-to-speech solution that runs on Raspberry Pi, supporting 15 languages with lightning-fast offline performance.

Voice & TTS | ⭐ 8.0k | MIT

ComicTranslate

Real-time AR comic translation: translate content simply by taking a photo with your smartphone, with support for AR overlay display.

Comics | ⭐ 6.0k | MIT

Phi-3.5 (Microsoft)

Microsoft's "little powerhouse": MIT-licensed and available in 3.8B/7B/14B parameter versions, offering remarkable performance despite its compact size.

Open Models | ⭐ 12.0k | MIT

Ultralight Digital Human

A mobile digital human that trains and runs directly on smartphones, serving as an excellent example of lightweight deployment.

Digital Human | ⭐ 5.8k | MIT

TangoFlux

Flow matching for text-to-audio generation, delivering high-fidelity sound effects, ambient sounds, and short musical pieces.

Music & Audio | ⭐ 2.1k | MIT

Lingji Cut

A local-first AI video creation workspace that covers the entire workflow from writing scripts to voiceover recording, shooting, and editing.

Video Tools | ⭐ 2.8k | MIT