AI Speed Box

AI News and Information Link Collection (Mint Bear's rough SNS scraps, matured into Visual AI News)
2025 AI Era Human Intelligence Conference
  1. AI
 
 
2025/01/12
Limited Release (Partial Release)
Voice Cursor
  1. AI Sound
  1. ETC sound
 
 
 
 
2024/12/22
Available Now (Available)
Photoshop (Beta) New Feature: Select Body Parts
  1. AI Image
  1. Adobe Photoshop
 
 
 
 
2024/12/19
Available Now (Available)
Report: AIBRAHAM
Kling 1.6
  1. AI Video
  1. Kling
 
 
 
 
2024/12/19
Available Now (Available)
Ideogram Batch Generation
  1. AI Image
  1. Ideogram
 
2024/12/18
Available Now (Available)
Midjourney Office Hours (2024-12-18)
  1. AI Image
  1. Midjourney
 
 
2024/12/18
Coming Soon
Veo 2
  1. AI Video
  1. _Google
 
 
 
 
2024/12/17
Coming Soon
Midjourney Moodboards
  1. AI Image
  1. Midjourney
2024/12/17
Available Now (Available)
Google's New AI Glasses (Android XR)
  1. AR, XR, VR
  1. _Google
 
 
2024/12/16
Coming Soon
Video watermarking technology, Meta Video Seal
  1. AI Video
  1. _Meta
 
 
 
 
2024/12/15
Available Now (Available)
Pika 2.0 Update
  1. AI Video
  1. Pika
 
 
 
2024/12/15
Available Now (Available)
Motivo by Meta
  1. AI 3D
  1. _Meta
 
 
 
2024/12/15
Available Now (Available)
Leffa by Meta
  1. AI Image
  1. _Meta
 
 
 
2024/12/14
Available Now (Available)
The Gemini 2.0
  1. AI LLM
  1. Gemini
 
 
 
 
2024/12/13
Limited Release (Partial Release)
Krea Editor Updates
  1. AI Image
  1. Krea
 
 
2024/12/13
Trellis 3D
  1. AI 3D
  1. 3D
 
 
2024/12/12
Available Now (Available)
Rodin
  1. AI 3D
  1. ETC
 
 
2024/12/12
Available Now (Available)
Midjourney Patchwork
  1. AI Image
  1. Midjourney
 
 
2024/12/12
Available Now (Available)
DiffSensei
  1. AI Toons
  1. ETC toons
 
 
 
 
2024/12/11
Available Now (Available)
MMAudio: Video-to-Audio Synthesis
  1. AI Sound
  1. ETC sound
 
 
 
 
2024/12/11
Available Now (Available)
Sora v2 showing in London
  1. AI Video
  1. Sora
 
 
 
2024/12/09
Available Now (Available)
Leonardo - FlowState
  1. AI Image
  1. Leonardo
 
 
 
 
2024/12/07
Available Now (Available)
ElevenLabs _ Conversational AI
  1. AI Sound
  1. ElevenLabs
 
 
2024/12/06
Coming Soon
OpenAI, 12 days of livestreams
  1. AI
  1. OpenAI
  2. OpenAI o1
  3. Sora
 
 
2024/12/05
Limited Release (Partial Release)
Google DeepMind just dropped Genie 2
  1. AI Video
 
 
 
2024/12/05
Available Now (Available)
Swift-Edit
  1. AI Image
  1. ETC Image
 
 
 
2024/12/05
Available Now (Available)
Midjourney Office Hours (2024-12-04)
  1. AI Image
  1. Midjourney
 
 
2024/12/04
Coming Soon
Gen3 KeyFraming (Prototype)
  1. AI Video
  1. Gen-3
 
 
2024/12/03
Coming Soon
Hunyuan Video by Tencent
  1. AI Video
  2. AI Sound
  1. Hunyuan
On 2024/12/03, Tencent launched Hunyuan Video, a powerful open-source AI video generation model.
https://aivideo.hunyuan.tencent.com
https://huggingface.co/tencent/HunyuanVideo/discussions
https://slashpage.com/mintbear/Hunyuan-01-intro
2024/12/03
Available Now (Available)
Motion Prompting (Google DeepMind)
  1. AI Video
  1. _Google
 
 
 
 
2024/12/03
Coming Soon

Hunyuan Video by Tencent

Category: AI Video, AI Sound
Gen: Hunyuan
Date: 2024/12/03
Summary 🍀🧸: On 2024/12/03, Tencent launched Hunyuan Video, a powerful open-source AI video generation model.
URL: https://aivideo.hunyuan.tencent.com
URL: https://huggingface.co/tencent/HunyuanVideo/discussions
URL: https://slashpage.com/mintbear/Hunyuan-01-intro
Release: Available Now (Available)
Hunyuan Video is an open-source text-to-video (T2V) AI model that Tencent (China) released on 2024/12/03.

Reference

Tencent Official

How to run HunyuanVideo on a single 24gb VRAM card

Etc

Tech

HunyuanVideo is a large-scale text-to-video generation model. It adopts a "Dual-stream to Single-stream" hybrid design to process text and video data in two stages:
1. Dual-stream stage: Text and video tokens are processed independently through multiple Transformer blocks, so each modality learns its own appropriate representation.
2. Single-stream stage: The text and video tokens are concatenated and fed to subsequent Transformer blocks, which fuse the multimodal information.
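
The two stages above can be pictured with a toy data-flow sketch. This is not Tencent's implementation: the helper names (`self_attention`, `make_block`, `dual_stream`, `single_stream`, `hybrid_forward`) are hypothetical, the "blocks" use weight-free attention with a fixed scale standing in for learned parameters, and the token values are made up. It only illustrates that the first stage keeps the modalities apart while the second lets them attend to each other.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
Tokens = List[Vector]
Block = Callable[[Tokens], Tokens]


def self_attention(tokens: Tokens) -> Tokens:
    """Weight-free scaled dot-product self-attention plus a residual connection."""
    dim = len(tokens[0])
    out: Tokens = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        mixed = [sum(w * tok[i] for w, tok in zip(weights, tokens)) / total for i in range(dim)]
        out.append([x + m for x, m in zip(query, mixed)])
    return out


def make_block(scale: float) -> Block:
    """Toy Transformer block; `scale` is an assumed stand-in for learned weights."""
    def block(tokens: Tokens) -> Tokens:
        return [[scale * x for x in tok] for tok in self_attention(tokens)]
    return block


def dual_stream(text: Tokens, video: Tokens, depth: int = 2) -> Tuple[Tokens, Tokens]:
    """Stage 1: each modality passes through its own blocks, with no cross-talk."""
    text_block, video_block = make_block(0.5), make_block(0.25)
    for _ in range(depth):
        text = text_block(text)
        video = video_block(video)
    return text, video


def single_stream(text: Tokens, video: Tokens, depth: int = 2) -> Tuple[Tokens, Tokens]:
    """Stage 2: concatenate both token sequences and run shared blocks over them."""
    shared_block = make_block(0.5)
    fused = text + video  # attention now spans text and video tokens together
    for _ in range(depth):
        fused = shared_block(fused)
    return fused[:len(text)], fused[len(text):]


def hybrid_forward(text: Tokens, video: Tokens) -> Tuple[Tokens, Tokens]:
    """Dual-stream stage followed by the single-stream stage."""
    text, video = dual_stream(text, video)
    return single_stream(text, video)


# Made-up 2-dimensional tokens: two text tokens, three video tokens.
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
video_tokens = [[0.5, 0.5], [1.0, -1.0], [0.2, 0.1]]
text_out, video_out = hybrid_forward(text_tokens, video_tokens)
print(len(text_out), len(video_out))
```

In this sketch, changing the text tokens leaves the video output of `dual_stream` untouched but does change the video output of `hybrid_forward`, which is the point of fusing the streams in the second stage.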