Train a 1.5B-Parameter GPT-2 in Two Days; AI Trader Generates 65% Annual Return; New Sequoia China Fund Eyes Investment in AI, Digital Economy

China’s AI news in the week of January 23, 2022

Jan 24, 2022

LAMB Creator Proposes Better Parallel Training Technique Than Megatron-LM

Increasing model size has proved effective at boosting AI performance, leading to the rise of large-scale models like GPT-3 and Vision Transformer; however, training gigantic AI models is prohibitively expensive and time-consuming.

A team of Chinese researchers recently proposed a parallel training technique to accelerate the training of large-scale models. Named Colossal-AI, the framework enables engineers to train a 1.5-billion-parameter GPT in two days, or an 8.3-billion-parameter GPT in five days.

Compared to NVIDIA’s parallel training framework Megatron-LM, Colossal-AI accelerates training by 10.7%, a considerable saving when training costs run into the millions of dollars. Using Colossal-AI, engineers can halve the number of GPUs needed, freeing up more compute for other applications.

Colossal-AI is essentially a well-engineered combination of different parallelization techniques: data parallelism, pipeline parallelism, multi-dimensional tensor parallelism, and sequence parallelism. Colossal-AI also extends tensor parallelism to higher dimensions (2D, 2.5D, and 3D), which is more efficient than the 1D tensor parallelism used by Megatron-LM.
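
To make the idea of splitting one layer across devices concrete, here is a toy sketch in plain Python. It is not the Colossal-AI or Megatron-LM API: all function names are hypothetical, "devices" are simulated by loop iterations, and matrix sizes are assumed to divide evenly. The 1D version gives each device a column slice of the weight matrix; the 2D version gives each device on a q-by-q grid one block of both the input and the weights.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def split_blocks(m, row_parts, col_parts):
    """Cut m into a row_parts x col_parts grid of equally sized blocks."""
    h, w = len(m) // row_parts, len(m[0]) // col_parts
    return [[[row[j * w:(j + 1) * w] for row in m[i * h:(i + 1) * h]]
             for j in range(col_parts)] for i in range(row_parts)]

def stitch(blocks):
    """Reassemble a grid of blocks into one matrix."""
    return [sum((blk[r] for blk in block_row), [])
            for block_row in blocks for r in range(len(block_row[0]))]

def tensor_parallel_1d(x, w, devices):
    """1D: every device holds all of x and one column slice of w."""
    w_slices = split_blocks(w, 1, devices)[0]
    return stitch([[matmul(x, w_slice) for w_slice in w_slices]])

def tensor_parallel_2d(x, w, q):
    """2D: device (i, j) holds one block of x and one block of w."""
    xb, wb = split_blocks(x, q, q), split_blocks(w, q, q)
    out = []
    for i in range(q):
        out_row = []
        for j in range(q):
            # Device (i, j) accumulates its output block from q block products.
            acc = matmul(xb[i][0], wb[0][j])
            for k in range(1, q):
                acc = add(acc, matmul(xb[i][k], wb[k][j]))
            out_row.append(acc)
        out.append(out_row)
    return stitch(out)

x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
print(tensor_parallel_1d(x, w, devices=2) == matmul(x, w))
print(tensor_parallel_2d(x, w, q=2) == matmul(x, w))
```

Both variants reproduce the ordinary matrix product. The difference is in what each device must store: in the 1D sketch every device keeps a full copy of the input, while in the 2D sketch each device keeps only one block of it.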

Colossal-AI benefits researchers who want to train large-scale models but have limited budgets and limited expertise in distributed training. Training the 175-billion-parameter GPT-3 neural network requires 3.114E23 FLOPs (floating-point operations), which would theoretically take 355 years on a V100 GPU server with 28 TFLOPS capacity and would cost $4.6 million at $1.5 per hour.
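
The GPT-3 estimate can be checked with a few lines of arithmetic. The sketch below uses only the three figures quoted above and assumes a 365-day year with perfect hardware utilization, so it is a back-of-the-envelope check rather than a real cost model.

```python
# Figures quoted in the article.
TOTAL_FLOPS = 3.114e23        # floating-point operations to train GPT-3
V100_FLOPS_PER_SEC = 28e12    # 28 TFLOPS sustained throughput
PRICE_PER_HOUR_USD = 1.5      # rental price per server-hour

# Assumptions for this sketch: 100% utilization, 365-day year.
SECONDS_PER_HOUR = 3600
HOURS_PER_YEAR = 24 * 365

gpu_hours = TOTAL_FLOPS / V100_FLOPS_PER_SEC / SECONDS_PER_HOUR
gpu_years = gpu_hours / HOURS_PER_YEAR
cost_usd = gpu_hours * PRICE_PER_HOUR_USD

print(f"{gpu_years:.0f} years, ${cost_usd / 1e6:.2f} million")
```

Under these assumptions the result lands at roughly 353 years and about $4.6 million, in line with the rounded figures above.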

The Chinese research team is led by Yang You, a Presidential Young Professor at the National University of Singapore. He is also the creator of LAMB, a widely adopted adaptive large-batch optimizer that shortened the training of BERT from three days to 76 minutes. You recently established a company called HPC-AI Tech, aiming to help train and deploy AI models using distributed software and large-scale AI platforms.

Quantitative Trading in China’s A-Share Market with FinRL

Lead: AI in quantitative stock trading is no secret in the financial market. What has been trending over the past year is using deep reinforcement learning (DRL), the core technology that made computers champions at Atari, Go, and chess, to trade stocks.

What’s new: A team of undergraduate students from China’s elite Tongji University recently proposed a quantitative trading program for China’s A-share market built with FinRL. The program treats the historical data of the stock market as a complex imperfect-information environment and trains an AI agent to maximize return and minimize risk in this environment. You can find the program on GitHub: https://github.com/AI4Finance-Foundation/FinRL-Meta/blob/master/Demo_China_A_share_market.ipynb.

How it works: The research team modeled the financial trading market as a Markov decision process (MDP) and trained the agent with deep deterministic policy gradient (DDPG), a reinforcement learning technique that combines Q-learning and policy gradients. They trained and evaluated their agent in FinRL, a virtual universe for financial reinforcement learning.
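
To illustrate what framing trading as an MDP means, here is a minimal single-stock environment in plain Python. It is a hypothetical toy, not the FinRL environment or the students' code: the class name, the state layout, and the reward (change in portfolio value) are all assumptions made for this sketch. The continuous action in [-1, 1] mirrors the kind of action space DDPG is designed for; the agent itself is not implemented here.

```python
class ToyTradingEnv:
    """Hypothetical single-stock MDP. State = (price, shares held, cash)."""

    def __init__(self, prices, initial_cash=1_000_000.0):
        self.prices = prices
        self.initial_cash = initial_cash
        self.reset()

    def reset(self):
        self.t = 0
        self.cash = self.initial_cash
        self.shares = 0.0
        return self._state()

    def _state(self):
        return (self.prices[self.t], self.shares, self.cash)

    def _portfolio_value(self):
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action):
        """action > 0: spend that fraction of cash; action < 0: sell that fraction of shares."""
        action = max(-1.0, min(1.0, action))
        price = self.prices[self.t]
        value_before = self._portfolio_value()
        if action > 0:
            spend = self.cash * action
            self.cash -= spend
            self.shares += spend / price
        elif action < 0:
            sold = self.shares * -action
            self.shares -= sold
            self.cash += sold * price
        self.t += 1  # the market moves to the next price
        reward = self._portfolio_value() - value_before
        done = self.t == len(self.prices) - 1
        return self._state(), reward, done


env = ToyTradingEnv(prices=[10.0, 11.0, 12.0])
state, reward, done = env.step(1.0)   # buy with all available cash
print(state, reward, done)
```

A DRL agent would learn a policy that maps each state to an action so as to maximize the cumulative reward; a real environment would also need transaction costs, multiple stocks, and market constraints.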

Results showed that, given an initial capital of 1,000,000, the agent generated 1,978,179 in a year and a half, for an annualized return rate of 64.35% and a Sharpe ratio of 1.99.
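
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. The function names are hypothetical, the 252-trading-day convention and zero risk-free rate are assumptions, and 1,978,179 is read here as the final portfolio value. The article does not state the exact backtest window or conventions used.

```python
import math

def annualized_return(initial, final, years):
    """Compound annual growth rate over a holding period given in years."""
    return (final / initial) ** (1.0 / years) - 1.0

def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a list of daily returns."""
    excess = [r - risk_free_daily for r in daily_returns]
    mean = sum(excess) / len(excess)
    variance = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)

rate = annualized_return(1_000_000, 1_978_179, years=1.5)
print(f"{rate:.2%}")
```

With a holding period of exactly 1.5 years this formula gives roughly 58%, so the reported 64.35% presumably reflects a somewhat shorter exact window or a different annualization convention.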

Why it matters: DRL-based stock trading is expected to become an indispensable tool for traders. Although still at an early stage of development, DRL has shown strong potential to generate substantial profits in stock trading.

Sequoia China Sets up Infrastructure Fund Backed by Brookfield

The U.S. and China, the two largest economies in the world, have been doubling down on tech-driven new infrastructure to power up their economies. Under the hood of this decade-long new infrastructure push are numerous investment opportunities, such as AI, the digital economy, and new energy networks.

Sequoia China, the Chinese arm of Silicon Valley venture capital powerhouse Sequoia Capital, announced that it has completed fundraising for a new infrastructure-focused fund, Sequoia China Infrastructure Fund (SCIF).

SCIF will place its bets on the digital economy, new energy, and life sciences. Investments will be spread across new energy infrastructure, modern logistics, cold chain logistics, data centers, new economy business parks, modern manufacturing workshops, and life science parks.

Says Sequoia China Founding and Managing Partner Neil Shen, “Sequoia China Infrastructure Fund will become another partner of choice for Chinese entrepreneurs in supporting companies’ visions with the backdrop of an exciting growth story of China’s new economy. We are thrilled to partner with Brookfield, with its expertise in infrastructure investing globally. Together, we will assist businesses to thrive in every aspect of their expansion needs.”

Despite souring US-China relations and China’s regulatory uncertainty, the digital economy and AI in China remain a goldmine for global investors. China wants to boost the digital economy’s share in its gross domestic product ($14.72 trillion in 2021) from 7.8% in 2020 to 10% by 2025, according to a top-level document released last week.

Brookfield, a leading global alternative asset manager with approximately $650 billion of assets under management, will be the largest limited partner of the fund.

Investment News

AInnovation Technology, a SoftBank-backed enterprise AI solutions provider, has priced its Hong Kong IPO to raise $151 million, according to Bloomberg. Founded in early 2018, the Beijing-based company provides “full-stack” AI solutions for manufacturing, financial services, and other industries.
Axera Technology, a semiconductor company developing artificial intelligence (AI) SoCs for computer vision applications, has raised RMB 800 million ($126 million) in its Series A++ funding round. Founded in 2019, the Beijing-based company has introduced two high-performance, energy-efficient AI computer vision chips, AX630A and AX620A.
SiBionics, a MedTech startup specializing in the R&D and commercialization of active implantable medical devices and medical AI, has raised RMB 800 million ($126 million) in its Series C funding round. Founded in 2015, the Shenzhen-based company develops brain-computer interfaces and visual coding, medical imaging AI, medical big data, and medical robots to find better solutions for diseases such as retinitis pigmentosa, diabetes mellitus and its complications, gastrointestinal cancers, and cardiovascular and cerebrovascular diseases.