Pretrained Language Model

This repository provides the latest pretrained language models and their related optimization techniques developed by Huawei Noah's Ark Lab.

Directory structure

PanGu-α is a large-scale autoregressive pretrained Chinese language model with up to 200B parameters. The models are developed under MindSpore and trained on a cluster of Ascend 910 AI processors.
NEZHA-TensorFlow is a pretrained Chinese language model, developed under TensorFlow, that achieves state-of-the-art performance on several Chinese NLP tasks.
NEZHA-PyTorch is the PyTorch version of NEZHA.
NEZHA-Gen-TensorFlow provides two GPT models: Yuefu (乐府), a Chinese classical poetry generation model, and a general-purpose Chinese GPT model.
TinyBERT is a compressed BERT model that is 7.5x smaller and 9.4x faster at inference.
TinyBERT-MindSpore is a MindSpore version of TinyBERT.
DynaBERT is a dynamic BERT model with adaptive width and depth.
BBPE provides a byte-level vocabulary building tool and its corresponding tokenizer.
PMLM is a probabilistically masked language model. Trained without the complex two-stream self-attention, PMLM can be treated as a simple approximation of XLNet.
TernaryBERT is a weight ternarization method for BERT models, developed under PyTorch.
TernaryBERT-MindSpore is the MindSpore version of TernaryBERT.
HyperText is an efficient text classification model based on hyperbolic geometry theories.
BinaryBERT is a weight binarization method for BERT models that uses ternary weight splitting, developed under PyTorch.
AutoTinyBERT provides a model zoo that can meet different latency requirements.
PanGu-Bot is a Chinese pre-trained open-domain dialog model built on the GPU implementation of PanGu-α.
CeMAT is a universal sequence-to-sequence multilingual pre-trained language model for both autoregressive and non-autoregressive neural machine translation tasks.
Noah_WuKong is a large-scale Chinese vision-language dataset and a group of benchmarking models trained on it.
Noah_WuKong-MindSpore is a MindSpore version of Noah_WuKong.
CAME is a Confidence-guided Adaptive Memory Efficient Optimizer.
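
To give a feel for what "byte-level" means in the BBPE entry above, here is a minimal sketch of the general idea: text is first turned into UTF-8 bytes, so 256 base symbols cover any input (including Chinese), and frequent adjacent pairs can then be merged into new symbols. This is not the tool shipped in the BBPE directory; the function names to_byte_ids, most_frequent_pair and merge_pair are hypothetical and exist only for this illustration.

```python
from collections import Counter


def to_byte_ids(text):
    """Encode text as a list of UTF-8 byte values, each in the range 0-255."""
    return list(text.encode("utf-8"))


def most_frequent_pair(ids):
    """Return the most frequent adjacent pair of symbols, or None if there is no pair."""
    pairs = Counter(zip(ids, ids[1:]))
    if not pairs:
        return None
    return pairs.most_common(1)[0][0]


def merge_pair(ids, pair, new_id):
    """Replace each non-overlapping occurrence of pair with new_id, scanning left to right."""
    merged = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            merged.append(new_id)
            i += 2
        else:
            merged.append(ids[i])
            i += 1
    return merged


if __name__ == "__main__":
    # Illustrative input only; a real vocabulary is built from a large corpus.
    ids = to_byte_ids("乐府 乐府 诗")
    pair = most_frequent_pair(ids)
    # Ids 0-255 are reserved for raw bytes, so the first merged symbol gets id 256.
    shorter = merge_pair(ids, pair, 256)
    print(len(ids), "->", len(shorter))
```

A full vocabulary builder would repeat the count-and-merge step until a target vocabulary size is reached and would keep a table of merges so that merged ids can be decoded back to bytes; both are omitted here for brevity.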

