# ModelLink

**Repository Path**: edwardvitor/ModelLink

## Basic Information

- **Project Name**: ModelLink
- **Description**: Ascend large model repository
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: https://gitee.com/ascend/ModelLink/
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1210
- **Created**: 2024-07-29
- **Last Updated**: 2024-07-29

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README
ModelLink aims to provide end-to-end large language model solutions for Huawei Ascend chips, including models, algorithms, and downstream tasks.

---

## ModelLink Solution Overview

### Supported features

ModelLink currently supports the following features for large model usage:

* [Dataset Preparation for Pre-training](#jump11) / [Instruction Dataset Preparation for Fine-tuning](#jump12)
* [Pre-training](#jump13) / [Full-parameter Fine-tuning](#jump14) / [Low-parameter Fine-tuning](#jump15)
* [Inference: human-machine dialogue](#jump16)
* [Evaluation with numerous benchmarks](#jump17)
* [Utilizing Acceleration Features (Acceleration Algorithms + Fusion Operators)](#jump18)
* [Profiling data based on Ascend chips](#jump19)
* [Converting checkpoints between Hugging Face and Megatron formats](#jump20) (a conceptual sketch follows the performance table below)
* [Enabling deterministic computing on Ascend chips](#jump21)
* [Enabling high-availability features based on Ascend chips](#jump22)

More novel and useful features for LLM training on Ascend are under development...

---

## ModelLink Maintenance Policies

Each ModelLink release passes through the following five maintenance phases:

| **Status** | **Duration** | **Description** |
|-------------------|--------------|-----------------|
| Planning | 1-3 months | Plan features. |
| Development | 3 months | Develop features. |
| Maintained | 6-12 months | Incorporate all resolved issues and release new versions. Different ModelLink versions adopt different support policies: the maintenance period is 6 months for Regular Releases and 12 months for Long-Term Support versions. |
| Unmaintained | 0-3 months | Incorporate all resolved issues. No dedicated maintenance personnel are available and no versions are released. |
| End Of Life (EOL) | N/A | Do not accept any modification to the branch. |

The maintenance policy for released ModelLink versions is as follows:

| **ModelLink** | **Maintenance Policy** | **Status** | **Launch Date** | **Subsequent Status** | **EOL Date** |
|---------------|------------------------|------------|-----------------|------------------------------------|--------------|
| 1.0 | Regular Release | Maintained | 2024/03/30 | Unmaintained, estimated 2024/09/30 | |
| 1.1 | Regular Release | Maintained | 2024/06/30 | Unmaintained, estimated 2024/12/30 | |

---

### Supported versions and models

Please note the corresponding environment versions for model usage:

| Software | [Version](https://www.hiascend.com/zh/) |
| :-------: | :------------------------: |
| Python | 3.8 |
| driver | in-development version |
| firmware | in-development version |
| CANN | in-development version |
| torch | 2.1.0, 2.2.0 |
| torch_npu | in-development version |

Based on the current version of Megatron, the performance statistics from our testing are as follows (hardware: Atlas 900 A2 PODc).

For each supported model below, we provide training scripts and README instructions in the `examples` folder, covering the full process of model training, inference, and evaluation.

The performance figures were measured with the MC2 acceleration feature enabled, which is only supported in versions later than 24RC2. The feature is disabled by default in the repository code; to enable it, refer to the `Acceleration Features` chapter on the home page.
| Model | Parameters | Sequence | Cluster (nodes x devices) | Mode | Performance (tokens/s/p) | Reference (tokens/s/p) | Contributor |
|---|---|---|---|---|---|---|---|
| Aquila | 7B | 2K | 1x8 | BF16 | 2849 | 2874 | 【Ascend】 |
| Aquila2 | 7B | 2K | 1x8 | FP16 | 3323 | 2673 | 【Community】 |
| Aquila2 | 34B | 4K | 2x8 | BF16 | 854 | 732 | 【Community】 |
| Baichuan | 7B | 4K | 1x8 | FP16 | 2685 | 2036 | 【Ascend】 |
| Baichuan | 13B | 4K | 1x8 | FP16 | 1213 | 862 | 【Ascend】 |
| Baichuan2 | 7B | 4K | 1x8 | BF16 | 2664 | 3969 | 【Ascend】 |
| Baichuan2 | 13B | 4K | 1x8 | BF16 | 1668 | 2062 | 【Ascend】 |
| Bloom | 7B1 | 2K | 1x8 | FP16 | 2034 | 2525 | 【Ascend】 |
| Bloom | 176B | 2K | 12x8 | BF16 | 100 | 107 | 【Ascend】 |
| ChatGLM3 | 6B | 8K | 1x8 | FP16 | 4297 | 4267 | 【Community】 |
| CodeLlama | 34B | 4K | 2x8 | BF16 | 837 | 762 | 【Community】 |
| InternLM | 7B | 2K | 1x8 | BF16 | 2776 | 2854 | 【Ascend】 |
| InternLM | 65B | 2K | 4x8 | BF16 | 341 | 414 | 【Ascend】 |
| LLaMA | 7B | 2K | 1x8 | FP16 | 3600 | 3804 | 【Ascend】 |
| LLaMA | 13B | 2K | 1x8 | FP16 | 1895 | 2012 | 【Ascend】 |
| LLaMA | 33B | 2K | 4x8 | FP16 | 621 | 776 | 【Ascend】 |
| LLaMA | 65B | 2K | 4x8 | BF16 | 348 | 426 | 【Ascend】 |
| LLaMA2 | 7B | 4K | 1x8 | BF16 | 4200 | 3850 | 【Ascend】 |
| LLaMA2 | 13B | 4K | 1x8 | BF16 | 1990 | 1920 | 【Ascend】 |
| LLaMA2 | 34B | 4K | 2x8 | BF16 | 749 | 796 | 【Ascend】 |
| LLaMA2 | 70B | 4K | 4x8 | BF16 | 420 | 430 | 【Ascend】 |
| LLaMA3 | 8B | 8K | 1x8 | BF16 | 2483 | 2674 | 【Ascend】 |
| LLaMA3 | 70B | 8K | 8x8 | BF16 | 283 | 355 | 【Ascend】 |
| Qwen | 7B | 8K | 1x8 | BF16 | 2499 | 2867 | 【Ascend】 |
| Qwen | 14B | 2K | 1x8 | BF16 | 1560 | 1578 | 【Ascend】 |
| Qwen | 72B | 8K | 16x8 | BF16 | 285 | 345 | 【Ascend】 |
| Qwen1.5 | 0.5B | 8K | 1x8 | BF16 | 22834 | 25306 | 【Community】 |
| Qwen1.5 | 1.8B | 8K | 1x8 | BF16 | 13029 | 12181 | 【Community】 |
| Qwen1.5 | 4B | 8K | 1x8 | BF16 | 5033 | 5328 | 【Community】 |
| Qwen1.5 | 7B | 8K | 1x8 | BF16 | 2862 | 2621 | 【Community】 |
| Qwen1.5 | 14B | 8K | 1x8 | BF16 | 1717 | 1702 | 【Community】 |
| Qwen1.5 | 32B | 8K | 4x8 | BF16 | 751 | 708 | 【Community】 |
| Qwen1.5 | 72B | 8K | 8x8 | BF16 | 301 | 317 | 【Ascend】 |
| Yi | 34B | 4K | 2x8 | BF16 | 809 | 730 | 【Community】 |
| Mixtral | 8x7B | 32K | 8x8 | BF16 | 702 | 837 | 【Ascend】 |
| Mistral | 7B | 32K | 1x8 | BF16 | 2806 | 2734 | 【Ascend】 |
| Gemma | 2B | 8K | 1x8 | BF16 | 6821 | 7602 | 【Ascend】 |
| Gemma | 7B | 8K | 1x8 | BF16 | 2938 | 2607 | 【Ascend】 |
| GPT3 | 175B | 2K | 16x8 | FP16 | 153 | -- | 【Community】 |
| GPT3 | 15B | 2K | 1x8 | FP16 | 1890 | 1840 | 【Community】 |
| Grok1 | 40B | 8K | 2x8 | BF16 | 1646 | 2057 | 【Ascend】 |
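The feature list above mentions converting checkpoints between Hugging Face and Megatron formats. Conceptually, such a conversion boils down to renaming weights and re-sharding them across tensor-parallel ranks. The following is a minimal, self-contained sketch of that idea; the key names, name mapping, and helper functions are illustrative assumptions and are not ModelLink's actual conversion tool.

```python
# Illustrative sketch only: the key names, mapping, and sharding below are
# hypothetical and do NOT reflect ModelLink's real conversion tool.
from typing import Dict, List

import torch


def shard_rows(weight: torch.Tensor, tp: int) -> List[torch.Tensor]:
    """Split a weight along dim 0 into `tp` tensor-parallel shards."""
    return list(torch.chunk(weight, tp, dim=0))


def convert_hf_to_megatron(hf_state: Dict[str, torch.Tensor],
                           tp: int) -> List[Dict[str, torch.Tensor]]:
    """Rename HF-style keys to Megatron-style keys and shard for `tp` ranks."""
    # Illustrative subset of a LLaMA-style name mapping.
    name_map = {
        "model.embed_tokens.weight": "embedding.word_embeddings.weight",
        "lm_head.weight": "output_layer.weight",
    }
    per_rank: List[Dict[str, torch.Tensor]] = [{} for _ in range(tp)]
    for hf_name, tensor in hf_state.items():
        megatron_name = name_map.get(hf_name, hf_name)
        for rank, shard in enumerate(shard_rows(tensor, tp)):
            per_rank[rank][megatron_name] = shard
    return per_rank


# Toy usage: an 8x4 "embedding" sharded over 2 tensor-parallel ranks.
state = {"model.embed_tokens.weight": torch.randn(8, 4)}
shards = convert_hf_to_megatron(state, tp=2)
print([s["embedding.word_embeddings.weight"].shape for s in shards])
```

A real converter additionally has to handle details such as interleaved QKV layouts, pipeline-parallel layer assignment, and optimizer state, which is why the repository ships a dedicated tool rather than a simple renaming script.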
### Acceleration Features

ModelLink provides the following acceleration features, each enabled by the listed argument:

| Scenario | Feature | Argument | Mcore Support | Legacy Support |
|---|---|---|---|---|
| PTD Parallel | Tensor Parallel | `--tensor-model-parallel-size` | Yes | Yes |
| PTD Parallel | Pipeline Parallel | `--pipeline-model-parallel-size` | Yes | Yes |
| PTD Parallel | Dynamic division for PP | `--num-layer-list` | Yes | Yes |
| PTD Parallel | Sequence Parallel | `--sequence-parallel` | Yes | Yes |
| PTD Parallel | Distributed Optimizer | `--use-distributed-optimizer` | Yes | Yes |
| Context Parallel | Context Parallel | `--context-parallel-size` | Yes | No |
| Context Parallel | Various CP algorithms | `--context-parallel-algo` | Yes | No |
| Context Parallel | Send/Recv Overlap | `--cp-send-recv-overlap` | Yes | No |
| MOE Parallel | MOE Parallel | `--expert-model-parallel-size` | Yes | No |
| MOE Parallel | MOE permutation communication optimization | `--moe-permutation-async-comm` | Yes | No |
| Memory Optimization | Recomputation | `--recompute-granularity` | No | Yes |
| Fused Kernel | Flash Attention | `--use-flash-attn` | Yes | Yes |
| Fused Kernel | Fused RMSNorm | `--use-fused-rmsnorm` | Yes | Yes |
| Fused Kernel | Fused SwiGLU | `--use-fused-swiglu` | Yes | Yes |
| Fused Kernel | Fused Rotary Position Embedding | `--use-fused-rotary-pos-emb` | Yes | Yes |
| Fused Kernel | Sliding Window Attention | `--sliding-window` | Yes | Yes |
| Communication | Overlap Grad Reduce | `--overlap-grad-reduce` | Yes | Yes |
| Communication | Overlap Param Gather | `--overlap-param-gather` | Yes | No |
| Communication | MC2 | `--use-mc2` | Yes | Yes |
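The parallelism arguments in the table compose multiplicatively: the product of the tensor-, pipeline-, and context-parallel sizes must divide the total device count, and the remaining factor becomes the data-parallel size. Below is a minimal sketch of that bookkeeping, assuming a Megatron-style argument parser; the flag names come from the table above, while the script itself and the example numbers are illustrative, not ModelLink's actual launcher.

```python
# Minimal sketch of how the PTD-parallel flags above compose; the parser and
# the example numbers are illustrative, not ModelLink's actual launcher.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--tensor-model-parallel-size", type=int, default=1)
parser.add_argument("--pipeline-model-parallel-size", type=int, default=1)
parser.add_argument("--context-parallel-size", type=int, default=1)
parser.add_argument("--world-size", type=int, required=True)  # nodes x devices

# Example: a 4x8 cluster (32 NPUs) with TP=8 and PP=4.
args = parser.parse_args([
    "--world-size", "32",
    "--tensor-model-parallel-size", "8",
    "--pipeline-model-parallel-size", "4",
])

model_parallel = (args.tensor_model_parallel_size
                  * args.pipeline_model_parallel_size
                  * args.context_parallel_size)
if args.world_size % model_parallel != 0:
    raise ValueError("world size must be divisible by TP * PP * CP")

data_parallel = args.world_size // model_parallel
print("data-parallel size:", data_parallel)  # -> 1
```

The per-model TP/PP settings actually used for the performance table live in the training scripts under the `examples` folder; the 8x4 split above is only for the arithmetic.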