Run Hugging Face Models Instantly with Day-0 Support from NVIDIA NeMo Framework


Table of Contents

  1. Introduction
  2. Introducing AutoModel in NVIDIA NeMo Framework
  3. How to use AutoModel
  4. Adding a new AutoModel class in NeMo
  5. Conclusion

Introduction

In the quest to maximize generative AI investments, access to the latest model developments is crucial. The NVIDIA NeMo Framework leverages the Megatron-Core and Transformer Engine backends to achieve high throughput and Model FLOPs Utilization (MFU) on NVIDIA GPUs. To provide Day-0 support for the latest models, the NeMo Framework introduces the Automatic Model (AutoModel) feature.

Introducing AutoModel in NVIDIA NeMo Framework

AutoModel is a high-level interface that simplifies support for pretrained models within the NeMo Framework. It enables seamless fine-tuning of any Hugging Face model for quick experimentation, with support for model parallelism, PyTorch JIT compilation, and an optimized Megatron-Core path for select models.

How to use AutoModel

To use AutoModel in the NeMo Framework for LoRA and Supervised Fine-Tuning (SFT), follow these steps:

  1. Instantiate a Hugging Face model through the AutoModel interface.
  2. Specify the LoRA target modules (for LoRA fine-tuning).
  3. For optimized performance in training and post-training, switch to the Megatron-Core path by swapping the Model class, Optimizer module, and Trainer strategy in place of their AutoModel counterparts; the rest of the recipe stays the same. A sketch of steps 1 and 2 follows this list.
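The snippet below is a minimal sketch of steps 1 and 2 using the NeMo 2.0 Python API. The class names HFAutoModelForCausalLM, HFDatasetDataModule, and llm.peft.LoRA follow the NeMo collections, but the exact module paths, checkpoint, dataset, target-module pattern, and trainer settings shown here are illustrative assumptions; check the NeMo documentation for the recipe that matches your release.

```python
from nemo import lightning as nl
from nemo.collections import llm

# Step 1: instantiate a Hugging Face model through the AutoModel interface.
# The checkpoint name is an example; any causal-LM checkpoint on the Hub should work.
model = llm.HFAutoModelForCausalLM(model_name="meta-llama/Llama-3.2-1B")

# Step 2: specify the LoRA target modules.
# The wildcard pattern is illustrative; match it to your model's module names.
lora = llm.peft.LoRA(target_modules=["*_proj"], dim=16)

# Fine-tune with LoRA; drop the peft argument for full-parameter SFT.
llm.api.finetune(
    model=model,
    data=llm.HFDatasetDataModule("rajpurkar/squad", split="train"),
    trainer=nl.Trainer(accelerator="gpu", devices=1, max_steps=100),
    peft=lora,
)
```

Switching to the Megatron-Core path (step 3) keeps this structure; only the model class, optimizer module, and trainer strategy change.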

Adding a new AutoModel class in NeMo

NeMo AutoModel currently provides a text-generation class, HFAutoModelForCausalLM. To add support for other tasks, create a similar subclass, adapting the initializer, model configuration, training/validation steps, and save/load methods to your specific use case.
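As an illustration, here is a hypothetical subclass for sequence classification. The class name, base class, and hook names below are assumptions modeled on the description above and on standard PyTorch Lightning conventions, not the verbatim NeMo implementation; compare against HFAutoModelForCausalLM in the NeMo source when writing a real subclass.

```python
import lightning.pytorch as pl
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer


class HFAutoModelForSequenceClassification(pl.LightningModule):
    """Hypothetical AutoModel-style class for sequence classification.

    Mirrors the structure described above: an initializer that records the
    checkpoint name, a model-configuration hook that builds the Hugging Face
    model lazily, task-specific training/validation steps, and a save helper
    that keeps the checkpoint in native Hugging Face format.
    """

    def __init__(self, model_name: str, num_labels: int = 2):
        super().__init__()
        self.model_name = model_name
        self.num_labels = num_labels
        self.model = None
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)

    def configure_model(self):
        # Build the HF model lazily so parallelism strategies can shard it.
        if self.model is None:
            self.model = AutoModelForSequenceClassification.from_pretrained(
                self.model_name, num_labels=self.num_labels
            )

    def training_step(self, batch, batch_idx):
        # Classification loss instead of the causal-LM loss used by the
        # text-generation class.
        outputs = self.model(**batch)
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def validation_step(self, batch, batch_idx):
        outputs = self.model(**batch)
        self.log("val_loss", outputs.loss, prog_bar=True)

    def configure_optimizers(self):
        # In NeMo the optimizer module is usually supplied externally;
        # a plain AdamW keeps this sketch self-contained.
        return torch.optim.AdamW(self.parameters(), lr=2e-5)

    def save_pretrained(self, path: str):
        # Persist in Hugging Face format so no checkpoint conversion is needed.
        self.model.save_pretrained(path)
        self.tokenizer.save_pretrained(path)
```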

Conclusion

The AutoModel feature in the NeMo Framework enables rapid experimentation with Hugging Face models, with performant implementations and no need for checkpoint conversions. It integrates seamlessly with the high-performance Megatron-Core path, letting users switch to optimized training with minimal code changes. AutoModel was introduced in the NeMo Framework 25.02 release.