Review: 12 Leading Chinese Companies in the AI Large Model Sector

The release of ChatGPT, OpenAI's large language model conversation application, in late November 2022 drew significant global attention and triggered a surge of interest from academic institutions and technology companies in China's AIGC (AI-Generated Content) sector. Leading companies have adopted a three-tier approach of "model + tool platform + ecosystem": they integrate large models deeply into various scenarios and complement them with professional tools and platforms, promoting application deployment, stimulating innovation, and creating a virtuous cycle.

The global AI large model market is projected to reach $21 billion by 2023. As technology continues to innovate and large models further develop, businesses and organizations will benefit from enhanced data analysis, predictive capabilities, and intelligent solutions, creating more opportunities and space for development in the AI field, and continuously driving market growth. China's vast market demand and abundant talent reserve provide favorable conditions for the development of large models. It is forecasted that China's large model industry market size will reach CNY 117.9 billion by 2028, with an average growth rate higher than the global average. At present, China and the United States account for over 80% of the world's research and development in the large model domain, indicating China's competitive strength in this area.

Considering the significance and potential of China's large model industry development, this article aims to list 12 leading AI large model companies in China and explore how they are leading industry innovation, enhancing corporate competitiveness, and driving social progress. These companies' explorations and innovations in the large model field not only strengthen the generality of AI technology but also actively contribute to the goal of achieving universal AI accessibility.

1. Alibaba, Tongyi Qianwen (Chinese: 阿里巴巴, 通义千问)

As one of China's largest e-commerce platforms, Alibaba has long used artificial intelligence to improve user experience and business efficiency. Its first step into large language models (LLMs) came in 2019 with PLUG, a general-purpose dialogue framework based on pre-trained language models. In November 2021, Alibaba DAMO Academy announced the M6 large model, a multimodal model with 10 trillion parameters, which made it the world's largest AI pre-training model at the time.

On April 7, 2023, Alibaba Cloud launched its self-developed large model "Tongyi Qianwen" and opened it to enterprises and invited users. According to the official description, "Tongyi Qianwen" is a large language model specifically designed to respond to human instructions, capable of understanding and answering questions from various domains, including common, complex, and even rare questions. "Tongyi" is Alibaba's unified brand for large models, spanning language, audio, and multimodal fields. It is dedicated to achieving general intelligence close to human wisdom and to transforming AI from "single-sensory" to "fully open senses."

2. Baidu, ERNIE Bot (Chinese: 百度, 文心一言)

ERNIE Bot is a generative dialogue product based on ERNIE large model technology, officially launched by Baidu on February 7, 2023. As a new member of the ERNIE large model family, ERNIE Bot can converse with people, answer questions, and assist in creative work, helping individuals acquire information, knowledge, and inspiration efficiently and conveniently. It is a knowledge-enhanced large language model built on Baidu's PaddlePaddle (Chinese: 飞桨) deep learning platform and the ERNIE knowledge-enhanced large model. It continuously learns from and integrates massive data and large-scale knowledge, giving it three characteristics: knowledge enhancement, retrieval enhancement, and dialogue enhancement.

ERNIE Bot's underlying approach is to provide services through Baidu Intelligent Cloud, attracting enterprise and institutional customers to use its APIs and infrastructure to jointly build AI models, develop applications, and make AI accessible across industries.

As one of the earliest technology companies in China to engage in large model research and development, Baidu released China's first officially open pre-training model, ERNIE 1.0, in March 2019 and has continued to invest in upgrading its large models. In December 2021, ERNIE 3.0 was upgraded to become the world's first knowledge-enhanced large model at the 100-billion-parameter scale and the largest Chinese single-body model at the time, achieving world-leading performance in over 60 authoritative natural language understanding and generation tasks. On July 25, 2023, SuperCLUE, a comprehensive benchmark for Chinese general-purpose large models, released its latest ranking of Chinese large language models. The ranking shows ERNIE Bot following closely behind GPT-4 and scoring higher than GPT-3.5 and the other domestic large models, the best performance among Chinese models.

3. 4Paradigm, SageGPT (Chinese: 第四范式, 式说)

Founded in September 2014, 4Paradigm is a pioneer and leader in the enterprise-level artificial intelligence field. 4Paradigm provides platform-centered AI solutions and has developed end-to-end enterprise-level AI products using core technology to solve efficiency, cost, and value challenges faced during enterprise intelligent transformation, elevating decision-making capabilities. It has been widely applied in finance, retail, manufacturing, energy and power, telecommunications, healthcare, and other sectors, ranking first in China's platform-centered decision-making enterprise-level AI market.

On April 26, 2023, 4Paradigm publicly showcased its large model product, "SageGPT 3.0" (Chinese: 式说 3.0), and introduced its AIGS (AI-Generated Software) strategy: reconstructing enterprise software using generative AI. SageGPT aims to be a new development platform based on a multimodal large model, improving the user experience and development efficiency of enterprise software and thereby realizing AIGS.

4. iFlytek, Spark (Chinese: 科大讯飞, 星火)

iFlytek Co., Ltd., established in 1999, is a well-known listed company in the Asia-Pacific region specializing in intelligent speech and artificial intelligence. Since its inception, the company has conducted core technology research in intelligent speech, natural language understanding, and computer vision, maintaining an internationally leading position. iFlytek actively promotes the application of AI products across industries, aiming to make machines "listen, speak, understand, and think" and to help build a better world with artificial intelligence. The company was listed on the Shenzhen Stock Exchange in 2008.

On May 6, 2023, iFlytek officially launched the iFlytek Spark cognitive large model and tested it live at the launch. As iFlytek's new-generation cognitive intelligence model, Spark possesses cross-domain knowledge and language comprehension capabilities and can carry out tasks through natural dialogue. It continuously evolves by learning from massive data and large-scale knowledge, forming a closed loop from posing and planning to solving problems.

iFlytek also announced three key milestones for upgrading the Spark cognitive large model within the year. June 9: breaking through in open question-answering, significantly improving multi-turn dialogue capability, and further upgrading mathematical capabilities. August 15: upgrading code capabilities and enhancing multimodal interaction capabilities to assist more partners and developer teams. October 24: matching ChatGPT as a general-purpose model, surpassing the then-current version of ChatGPT in Chinese and reaching a similar level in English, while leading the industry in fields such as education and healthcare.

5. Langboat, Mengzi GPT (Chinese: 澜舟科技, 孟子生成式大模型)

On March 14, 2023, Langboat Technology, a leading company in the large language model sector, announced the completion of its Pre-A+ round of financing. The round was led by Zhongguancun Science Park Corporation, with continued participation from Sequoia Capital and Sinovation Ventures (Chinese: 创新工场). In less than a year, Langboat has raised hundreds of millions of RMB in total funding. Established in 2021, Langboat is a cognitive intelligence company dedicated to providing global enterprises with a new-generation cognitive intelligence platform based on natural language processing (NLP) technology, supporting their digital transformation and upgrading. Its main product lines, all built on the "Mengzi Pre-training Model," include function engines (such as search, generation, translation, and dialogue) and applications for vertical scenarios.

Langboat began research and development in natural language technology earlier than most of the industry and has built up substantial capabilities in lightweight models. In March 2023, it released "Mengzi MChat," an AI dialogue robot based on its "Mengzi Generative Large Model Technology" (Mengzi GPT). Mengzi MChat features a "controllable" large model, which the company describes as more flexible than comparable technologies. Langboat focuses on deployment in vertical domains and specialized tracks, which allows rapid adjustment to industry and regional needs while effectively integrating industry data, knowledge graphs, and real-time retrieval.

6. Cloudwalk, Cong Rong (Chinese: 云从科技, 从容大模型)

Founded in 2015, Cloudwalk Technology is the first artificial intelligence platform company successfully listed on the Science and Technology Innovation Board, with stock code 688327. The company aims to bridge the digital and physical worlds through a human-machine collaborative operating system (CWOS) that operates similarly to humans. It practices the integration of multiple large models in areas like vision, speech, NLP, etc., based on data elements, further enhancing core technology research and development in the big data and AI fields, and continuously improving R&D innovation and operational capabilities.

After years of accumulation, Cloudwalk Technology trained sufficiently powerful foundational large models and launched "Cong Rong" in May 2023. Through real-time learning and synchronous feedback, Cong Rong can solve pain points in AI applications, facilitating rapid adoption of personalized applications. Additionally, it possesses context learning capabilities, achieving better interaction, especially in interactive scenarios such as finance and gaming, where multi-turn dialogue technology will be fully utilized in the human-machine collaborative operating system.

7. Zhipu AI, ChatGLM (Chinese: 智谱AI, ChatGLM)

Zhipu AI is a company derived from the technology transfer of the Department of Computer Science at Tsinghua University, committed to creating a new generation of cognitive intelligent general models. The company collaboratively developed a bilingual hundred-billion-parameter ultra-large-scale pre-training model, GLM-130B, and constructed a high-precision universal knowledge graph, forming a data and knowledge dual-wheel-driven cognitive engine. Based on this model, they created ChatGLM. In addition, Zhipu AI also launched the cognitive large model platform Bigmodel.ai, including products like CodeGeeX and CogView, providing intelligent API services, connecting billions of users in the physical world, empowering digital humans in the metaverse, and endowing machines with "thinking" capabilities like humans.

In July 2023, it was reported that Zhipu AI had completed its B-2 round of financing several months ago, raising hundreds of millions of RMB, with exclusive investment from Meituan, valuing the company at nearly 500 million USD.

8. China Telecom Digital Intelligence Technology, TeleChat (Chinese: 中国电信数字智能科技, TeleChat)

On May 19, 2022, China Telecom established its subsidiary, China Telecom Digital Intelligence Technology Co., Ltd., as a technology-driven and platform-based professional company engaged in big data and AI business under China Telecom. Formerly, it was the data center of China Telecom Group. China Telecom Digital Intelligence Technology aims to build a ten-thousand-level AI algorithm cabin as its research and development goal and become a hundred-billion-level AI service provider as its business development goal. Relying on proprietary algorithms and equipment, the company creates standardized products and platforms that are multi-scenario, multi-application, and replicable, further strengthening core technology research and development and operational capabilities in the big data and AI fields.

On July 6, 2023, China Telecom Digital Intelligence Technology officially released China Telecom's large language model, "TeleChat," built on the company's cloud-network convergence advantages, and demonstrated products that apply the model in three directions: data center empowerment, intelligent customer service, and smart governance. TeleChat is pre-trained on a large amount of high-quality Chinese and English text and fine-tuned on tens of millions of question-answer pairs. The team also designed a progressive expansion attention mechanism that enlarges the model's receptive field through interval sampling, developed a self-calibration fine-tuning technique that uses iterative relevance bias as the supervision signal for reinforcement learning, and employed knowledge graph synergistic enhancement strategies to augment the model's pre-training and inference capabilities, reducing the occurrence of hallucinations.
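The article gives no implementation details for TeleChat's attention mechanism, so the following is only a rough illustration of the general idea of enlarging a receptive field through interval sampling (often called dilated attention), not China Telecom's actual design. The function names, the fixed number of keys per query, and the interval schedule of 1, 2, 4 are all assumptions made for this sketch.

```python
def sampled_key_positions(query_pos, num_keys, interval):
    """Key positions a query would attend to when it looks back at every
    `interval`-th token (hypothetical helper, not TeleChat's API)."""
    positions = [query_pos - k * interval for k in range(num_keys)]
    # Drop positions that fall before the start of the sequence.
    return [p for p in positions if p >= 0]


def receptive_field(num_keys, interval):
    """Span of tokens covered by `num_keys` samples spaced `interval` apart."""
    return (num_keys - 1) * interval + 1


# Assumed "progressive expansion": the interval grows from layer to layer,
# so the same number of keys covers a wider span of the sequence.
for layer, interval in enumerate([1, 2, 4]):
    keys = sampled_key_positions(query_pos=16, num_keys=4, interval=interval)
    print(layer, keys, receptive_field(4, interval))
```

With the same four keys per query, the covered span grows from 4 tokens at interval 1 to 13 tokens at interval 4, which is the sense in which interval sampling widens the receptive field without attending to more positions.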

9. Wenge Group, YAYi (Chinese: 中科闻歌, 雅意大模型)

Wenge Group is an artificial intelligence company incubated by the Institute of Automation, Chinese Academy of Sciences (CAS). It focuses on complex data analysis and AI-assisted decision-making and has developed the DIOS platform, a cognitive and decision-making intelligent foundation with completely independent intellectual property rights. DIOS enables the transformation and upgrade of various industries towards digitalization and intelligence.

The core team of Wenge Group comes from CAS and other top research institutions at home and abroad, making them pioneers and promoters in the field of secure information studies. The team has accumulated over a decade of theoretical research, technical development, and application practices in big data and AI, with more than 100 patent applications and over 3,000 self-developed core algorithms. Their core capabilities include deep semantic understanding, social computing, and AI platform engineering.

On June 3, 2023, Wenge Group released the secure and reliable enterprise-level exclusive large model "YAYi," equipped with five core capabilities: real-time networked question-answering, domain knowledge question-answering, multilingual content understanding, complex scene information extraction, and multimodal content generation. It offers over 100 distinctive skills and can quickly integrate government and enterprise data to generate large model-specific application services. YAYi can be applied in various fields such as media, finance, publicity, governance, and security.

10. Huawei, Pangu (Chinese: 华为, 盘古大模型)

The Huawei Pangu series of large models includes five foundational large models (L0): a Chinese language (NLP) large model, a visual (CV) large model, a multimodal large model, a scientific computing large model, and a graph network (Graph) large model. The Chinese language (NLP) large model is the industry's first Chinese pre-training large model with more than a hundred billion parameters and is considered the AI large model that most closely approximates human understanding of Chinese. Unlike foreign AI models such as ChatGPT, Huawei's Pangu large models are optimized for the Chinese language.

In addition to the five foundational large models (L0), the Pangu large models are continuously evolving, divided into three levels: L0, L1, and L2. L0 represents foundational large models, L1 refers to industry large models, and L2 is for inference models focused on more specialized scenarios. At the L1 level, Huawei has already launched industry large models such as Pangu Financial, Pangu Mining, Pangu Meteorology, Pangu Power, Pangu Manufacturing Quality Inspection, Pangu Pharmaceutical Molecule, and more. In the L2 segment, Huawei has introduced various applications based on the Pangu large models, including short-term meteorological forecasts and typhoon predictions based on the meteorological large model, unmanned aerial vehicle power inspection and power defect recognition based on the power large model, and fashion-assisted design and fashion copyright protection based on the fashion large model. Additionally, in the fields of the Internet of Things, intelligent cockpits, intelligent driving, etc., Huawei has launched various applications based on the Pangu large models.

11. SenseTime, SenseNova (Chinese: 商汤科技, 日日新)

On April 10, 2023, SenseTime introduced the "SenseNova" large model system, including multiple large models and capabilities for natural language processing (SenseChat (Chinese: 商量)), content generation, automatic data annotation, custom model training, etc. SenseTime has proactively built a new type of AI infrastructure, the SenseCore AI large facility, which consists of 27,000 GPU chips, providing a total computing power of 5.0 exaFLOPS. It is one of the largest intelligent computing platforms in Asia. Leveraging SenseCore, SenseTime plans to achieve a research and development system that combines "large models + large computing power" in an integrated manner.

SenseTime has accumulated five years of experience in the large model field. In 2019, it released a vision large model with 1 billion parameters, and last year it unveiled a vision large model with 32 billion parameters, which is the largest visual model in the world to date. In the NLP field, SenseTime's large language models have also reached the level of hundreds of billions of parameters. Recently, SenseTime open-sourced the multimodal model "Shusheng 2.5" (English name: INTERN 2.5) with 3 billion parameters. In the AIGC field, SenseTime also has 1 billion parameter models, which support various functions related to text and graph data.

12. Tencent, HunYuan (Chinese: 腾讯, 混元大模型)

In April 2022, Tencent first revealed the progress of its HunYuan AI large model, a large-scale intelligent model covering multiple areas, including CV, NLP, multimodal content understanding, copywriting generation, and video content understanding. In December 2022, Tencent launched the trillion-parameter Chinese NLP pre-training model "HunYuan-NLP-1T", which achieved a score exceeding 80.888 in the CLUE benchmark for natural language understanding, ranking first and setting a new record in the benchmark's history.

The HunYuan large model has been successfully applied in Tencent's internal products such as advertising, search, and dialogue, and also serves external customers through Tencent Cloud. It is built on Tencent's powerful underlying computing power and low-cost, high-speed network infrastructure, and runs on Tencent's self-developed Taiji Machine Learning platform. In July 2023, Tencent disclosed the progress of its industry large model plan and released the panorama of its MaaS (Model-as-a-Service) capabilities, emphasizing the vision of creating "more industry-specific and easier-to-implement industry large models" to help enterprises build exclusive industry large models.