Advancing Multimodal Generative AI Research
Multimodal LLM
Multimodal Diffusion
HPT 1.5 Edge is our latest open-sourced model for edge devices.
With only around 4B parameters, Edge is extremely efficient and still achieves impressive results on many challenging benchmarks (MMMU, POPE, SEED-I, and more). The model is publicly released on Hugging Face and GitHub.
- HPT 1.5 Edge achieves competitive performance, with the best results on MMMU, POPE, and MathVista among models of similar size.
HPT 1.5 Air is our best open-sourced 8B Multimodal Llama 3.
Our hyper-capable HPT 1.5 Air packs a punch on real-world understanding and complex reasoning. It achieves the best results among models under 10B parameters across a wide range of challenging benchmarks (MMMU, POPE, SEED-I, and more). HPT 1.5 Air is publicly available on Hugging Face and GitHub.
- HPT 1.5 Air is the best publicly available multimodal Llama 3, achieving the best results on the challenging MMMU benchmark.
- HPT 1.5 Air hallucinates less (best POPE results) while showing strong results on all four benchmarks.
HPT Pro is HyperGAI's proprietary and most optimized model, highly capable of solving very complex multimodal tasks.
HPT Pro outperforms larger proprietary models such as GPT-4V and Gemini Pro on the MMBench and SEED-Image benchmarks.
HPT Pro achieves state-of-the-art results for a model of its size on the MMMU leaderboard.
- HPT 1.0 Pro demonstrates the best result among models of similar size in multimodal understanding, evaluated on both MMBench and MMBench-CN.
- HPT 1.0 Pro ranks second on MMMU (val) for college-level understanding.
- HPT 1.0 Pro performs best in visual perception and understanding, as seen on SEED (Img).
HPT Air is HyperGAI's first free-to-use, open-source model.
Our most efficient model for its size, HPT Air is capable of solving a wide range of vision and language tasks. HPT Air is publicly available and achieves state-of-the-art results on the MMMU benchmark among open-source multimodal LLMs of similar or smaller size.
- HPT 1.0 Air demonstrates the best result among models of similar size in multimodal understanding in English, evaluated on MMBench.
- HPT 1.0 Air achieves the best result on MMMU (val) for college-level understanding and reasoning.
- HPT 1.0 Air ranks second in visual perception and understanding, as seen on SEED (Img).
---- Dr. Steven Hoi, Founder and CEO