News
Announcing our $800M Series C to accelerate the shift to open-source AI
4+ day, 7+ hour ago (428+ words) Delivering 31% more TPS than the next-fastest OSS engine for production coding agent workloads | Announcing our Series C. Intelligence should be abundant, not expensive | Join us at RAISE 2026 in Paris | On-demand B200s now available on Together GPU Clusters | Now serving MiniMax-M3 for…
Together AI at ICML 2026: frontier research across the full stack
5+ day, 4+ hour ago (992+ words) Now serving MiniMax-M3 for efficient inference | On-demand B200s now available on Together GPU Clusters | Delivering 31% more TPS than the next-fastest OSS engine for production coding agent workloads | How Together built the world's fastest speech-to-text stack | Join us at RAISE 2026 in…
Form | AI Factory Request
2+ week, 2+ day ago (181+ words)
Kimi K2.7 Code API
3+ week, 1+ day ago (140+ words) Inference for batch workloads | Inference on custom hardware | Inference for custom models | Explore the top open-source models | Reliable GPU clusters at scale | Custom infrastructure at frontier scale | Build development environments for AI | Store model weights & data securely | Shape models with…
AI Engineer World's Fair
3+ week, 3+ day ago (214+ words)
Building trust in enterprise AI: Together AI earns ISO 27001:2022 certification
3+ week, 4+ day ago (304+ words)
NVIDIA Nemotron 3.5 ASR API
1+ mon, 1+ day ago (181+ words)
Together AI and Pearl Research Labs Team Up to Reduce the Cost of AI Inference
1+ mon, 3+ week ago (243+ words) FlashAttention-4: up to 1.3x faster than cuDNN on NVIDIA Blackwell | Introducing Together AI's new look | ATLAS: runtime-learning accelerators delivering up to 4x faster LLM inference | Together GPU Clusters: self-service NVIDIA GPUs, now generally available | Batch Inference API: Process billions of…
Research Blog
1+ mon, 3+ week ago (181+ words)
NVIDIA Nemotron 3 Nano Omni API
2+ mon, 1+ week ago (377+ words)