Top Chinese AI Institute Creates a Virtual Worm; Former JD.com AI Chief Joins Tsinghua University; Meet AI Video Generator CogVideo

Weekly China AI News from May 30 to June 5

Recode China AI
7 min read · Jun 6, 2022

News of the Week

BAAI Creates an Artificial Roundworm That Can Wriggle in a Computer

What’s new: Beijing Academy of Artificial Intelligence (BAAI) last week introduced an artificial worm that can wriggle and drink water in simulation. The Chinese AI institute touted its latest invention as a key step toward building an artificially intelligent organism.

BAAI researchers simulated the full biological neural system, including all 302 neurons and their subcellular-level connectivity, of Caenorhabditis elegans (C. elegans), a 1 mm-long, free-living, transparent nematode that lives in temperate soil environments. The virtual roundworm, named MetaWorm 1.0, can move autonomously in a 3D fluid simulation environment. The next step is to train the digital worm to avoid obstacles and seek food.

Why a virtual worm? Worms are among the simplest of creatures, but they are cleverer than they look. For example, C. elegans consists of only about 1,000 somatic cells, yet it can perceive its environment, escape from predators, feed itself, and even breed, making it a good candidate for a model organism. C. elegans was also the first multicellular organism to have its whole genome sequenced. Studying C. elegans will help scientists better understand our complex human brains.

Predecessor: BAAI’s MetaWorm 1.0 is the latest effort to study and create a virtual worm. The first project of its kind is OpenWorm, an open-source project initiated in 2013 and dedicated to creating the first virtual organism in a computer. Its goal is to build the first comprehensive computational model of C. elegans.

Former IBM Watson Chief Scientist & JD.com AI Chief Joins Tsinghua University

What’s new: Elite Chinese school Tsinghua University last week made a high-profile hire by appointing Bowen Zhou, a former chief scientist at IBM Watson and President of Cloud & AI at JD.com, as a tenured professor in the Department of Electronic Engineering and Huiyan Chair Professor. Dr. Zhou is the latest top Chinese AI talent to join academia from industry.

Who’s Bowen Zhou: One of the top AI masterminds in China, Dr. Zhou was the President of the Artificial Intelligence Platform & Research of JD.com between 2017 and 2021. Before joining JD.com, Dr. Zhou was with IBM for almost 15 years and was the Chief Scientist of Watson Group, where he was responsible for leading Watson Group’s science agenda and aligning it with IBM’s technical strategy and IBM Research’s cognitive computing and artificial intelligence agenda.

Dr. Zhou specializes in statistical models, machine learning, and human language technologies, including both speech and text. His most-cited paper, A Structured Self-attentive Sentence Embedding, proposed a new model for extracting an interpretable sentence embedding by introducing self-attention.
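For readers curious about the mechanism, the attention step in that paper can be written as A = softmax(W_s2 tanh(W_s1 H^T)) and M = A H, where H stacks the per-token hidden states. The sketch below is a minimal pure-Python illustration of that formula, not the authors' code. The toy sizes and weight values are made up for the example, and the paper obtains H from a bidirectional LSTM, which is omitted here.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attentive_embedding(H, W_s1, W_s2):
    """Return (A, M) for token states H of shape n x d.

    W_s1 is d_a x d and W_s2 is r x d_a, so A is r x n and M is r x d.
    """
    hidden = [[math.tanh(v) for v in row] for row in matmul(W_s1, transpose(H))]
    A = [softmax(row) for row in matmul(W_s2, hidden)]  # each row: weights over tokens
    M = matmul(A, H)  # r weighted sums of the token states
    return A, M


# Toy inputs (hypothetical values): 3 tokens, hidden size 2, d_a = 2, r = 2 attention rows.
H = [[0.1, 0.4], [0.9, -0.3], [0.2, 0.7]]
W_s1 = [[0.5, -0.2], [0.3, 0.8]]
W_s2 = [[1.0, 0.0], [0.0, 1.0]]

A, M = self_attentive_embedding(H, W_s1, W_s2)
for weights in A:
    print([round(w, 3) for w in weights])
```

Each row of A is a probability distribution over the tokens, so inspecting it shows which words a given row of the embedding attends to.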

Dr. Zhou received a B.E. degree from the University of Science and Technology of China, Hefei, Anhui, China, in 1996, an M.E. degree from the Chinese Academy of Sciences, Beijing, China, in 1999, and a Ph.D. degree from the University of Colorado at Boulder, Boulder, CO, USA, in 2003, all in electrical engineering.

Open positions: Dr. Zhou’s research group at Tsinghua University has open positions for faculty members, researchers, postdoctoral fellows, engineers, and postgraduates. The research group is committed to key technologies of multi-modal interactive digital intelligence, including but not limited to 1) multimodal representation, understanding, generation, reasoning, and interaction, 2) basic theory and a new paradigm of trustworthy artificial intelligence, and 3) AI+ industrial digital intelligence applications.

AI Can Now Generate Videos out of Text Descriptions

What’s new: While recent text-to-image generators like Google’s Imagen and OpenAI’s DALL-E have drawn plenty of attention, researchers from Tsinghua University and BAAI took a step further by proposing a text-to-video generator, named CogVideo, that is claimed to outperform all publicly available models by a large margin in machine and human evaluations. Let’s look at some demos first.

Given a text prompt such as “A woman is running on the beach in the late afternoon” or “a burning heart”, the model generates a 480×480 video.

Technical details: CogVideo is built on BAAI’s pretrained text-to-image model, CogView2, boosted by a multi-frame-rate hierarchical training strategy to better align text and video clips. The paper describes the pipeline as follows (the colors refer to a figure in the paper): “Input sequence includes frame rate, text, frame tokens. [B] (Begin-of-image) is a separator token, inherited from CogView2. In stage 1, Ts frames are generated sequentially on condition of frame rate and text. Then in stage 2, generated frames are re-input as bidirectional attention regions to recursively interpolate frames. Framerate can be adjusted during both stages. Bidirectional attention regions are highlighted in blue, and unidirectional regions are highlighted in green.” CogVideo is essentially a Transformer with 9 billion parameters pre-trained on a dataset of 5.4 million captioned videos.
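The two stages in the quote can be pictured on a timeline. The sketch below is only a toy illustration: it tracks frame timestamps rather than image tokens, and the helper names, the five key frames, the step of 8, and the choice to insert one midpoint frame per gap are all assumptions made for the example, not details taken from CogVideo.

```python
def key_frame_times(num_frames, step):
    """Stage 1 (toy): timestamps of sequentially generated key frames."""
    return [i * step for i in range(num_frames)]


def interpolate_once(times):
    """Stage 2 (toy): insert one new frame midway between each neighbouring pair."""
    out = []
    for left, right in zip(times, times[1:]):
        out.append(left)
        out.append((left + right) // 2)
    out.append(times[-1])
    return out


def interpolate_recursively(times, rounds):
    """Repeat the midpoint insertion, halving the gap between frames each round."""
    for _ in range(rounds):
        times = interpolate_once(times)
    return times


keys = key_frame_times(num_frames=5, step=8)
print(keys)                                   # [0, 8, 16, 24, 32]
print(interpolate_recursively(keys, 1))       # [0, 4, 8, 12, 16, 20, 24, 28, 32]
print(len(interpolate_recursively(keys, 2)))  # 17
```

In this picture, stage 1 fixes a coarse, low-frame-rate outline of the clip, and each interpolation round doubles the effective frame rate while keeping the already generated frames in place.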

Why it matters: CogVideo is said to be the first large-scale open-source text-to-video model. Researchers said they hope this project will be helpful to short-video creators and digital artists. Also, CogVideo successfully inherits from a pretrained text-to-image model, so researchers don’t have to build text-to-video models from scratch.

Papers & Projects

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

Researchers from Shanghai Jiao Tong University, Digital Brain Lab, the University of Oxford, the Chinese Academy of Sciences, University College London, and Peking University introduced a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into sequence modeling problems wherein the task is to map agents’ observation sequence to agents’ optimal action sequence. Their goal is to build a bridge between MARL and sequence models (SMs) so that the modeling power of modern sequence models can be unleashed for MARL. Results demonstrate that MAT achieves superior performance and data efficiency compared to strong baselines including MAPPO and HAPPO.

EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers

Researchers from the Chinese University of Hong Kong, Samsung, and Queen Mary University of London introduced EdgeViTs, a new family of light-weight ViTs that, for the first time, enable attention-based vision models to compete with the best light-weight CNNs in the tradeoff between accuracy and on-device efficiency. This is realized by introducing a highly cost-effective local-global-local (LGL) information exchange bottleneck based on optimal integration of self-attention and convolutions. For device-dedicated evaluation, rather than relying on inaccurate proxies like the number of FLOPs or parameters, they adopt a practical approach of focusing directly on on-device latency and, for the first time, energy efficiency. The models are Pareto-optimal when both accuracy-latency and accuracy-energy trade-offs are considered, achieving strict dominance over other ViTs in almost all cases and competing with the most efficient CNNs.
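Pareto-optimal here means that no other model is at least as accurate and at least as fast while being strictly better on one of the two. A minimal sketch of that check follows. The model names and numbers are hypothetical placeholders, not results from the EdgeViTs paper.

```python
def pareto_front(models):
    """Return the names of models not dominated on (accuracy, latency).

    `models` maps a name to (accuracy, latency_ms); higher accuracy and
    lower latency are better.
    """
    front = []
    for name, (acc, lat) in models.items():
        dominated = any(
            o_acc >= acc and o_lat <= lat and (o_acc > acc or o_lat < lat)
            for other, (o_acc, o_lat) in models.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Hypothetical (accuracy %, latency ms) pairs, for illustration only.
models = {
    "model_a": (70.0, 10.0),
    "model_b": (75.0, 20.0),
    "model_c": (72.0, 25.0),  # slower and less accurate than model_b
    "model_d": (80.0, 40.0),
}

print(pareto_front(models))  # ['model_a', 'model_b', 'model_d']
```

The same function covers the accuracy-energy trade-off if the second number in each pair is an energy measurement instead of latency.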

Machine Learning Compilation Course

Tianqi Chen, an Assistant Professor in the Machine Learning Department and Computer Science Department at Carnegie Mellon University and the Chief Technologist of OctoML, has launched a new machine learning compilation course aimed at people who work on machine learning in the wild. The course, which comprises 11 episodes, will systematically introduce how machine learning can be deployed in different production environments.

Rising Startups

WM Motor, an electric vehicle startup, has raised $600 million in its pre-IPO funding round. The company has filed to go public on the Hong Kong Stock Exchange and is considering raising about $1 billion. Founded in 2015, the Shanghai-based company doubled sales to 44,152 vehicles in 2021 from a year earlier.

Yuanzhi Technology, a provider of comprehensive information technology solutions, has raised RMB430 million in a strategic funding round. Founded in 2002, the Beijing-based company integrates cloud computing, big data, the Internet of Things, and mobile Internet technologies to promote innovative applications of enterprise and government information resources.

NextVPU, a developer of computer vision systems and chips, has raised hundreds of millions of yuan in its Series C funding round. Founded in 2016, the Shanghai-based company focuses on the research and development of artificial intelligence and computer vision systems and chips, and provides vision technology for robots, drones, unmanned vehicles, and other smart devices.

Recode China AI

A weekly newsletter on emerging AI trends and technologies in China