Llama 2 7B Text Completion with Torchrun
This script demonstrates how to use torchrun, PyTorch's distributed launcher, to run text completion with the Llama 2 7B model.
Requirements:
- Python 3.8 or higher
- PyTorch
- Transformers library
Installation:
pip install torch transformers
Usage:
torchrun --nproc_per_node 1 run_text_completion.py --ckpt_dir 'llama-2-7b/' --tokenizer_path 'tokenizer.model' --max_seq_len 128 --max_batch_size 4
Code:
import argparse

from example_text_completion import main

parser = argparse.ArgumentParser()
parser.add_argument('--ckpt_dir', type=str, default='llama-2-7b/')
parser.add_argument('--tokenizer_path', type=str, default='tokenizer.model')
parser.add_argument('--max_seq_len', type=int, default=128)
parser.add_argument('--max_batch_size', type=int, default=4)
args = parser.parse_args()

# Forward the parsed values as keyword arguments so they match
# the parameters of main() rather than passing the Namespace itself.
main(ckpt_dir=args.ckpt_dir,
     tokenizer_path=args.tokenizer_path,
     max_seq_len=args.max_seq_len,
     max_batch_size=args.max_batch_size)
Explanation:
- run_text_completion.py: The Python script containing the code. It uses the argparse library to define command-line arguments for configuring the model run.
- ckpt_dir: Specifies the directory where the model checkpoint is located.
- tokenizer_path: Provides the path to the tokenizer model file used for encoding and decoding text.
- max_seq_len: Defines the maximum length of input sequences.
- max_batch_size: Sets the maximum batch size for model inference.
- main(...): The entry point from example_text_completion.py; it runs the core text completion logic using the parsed arguments.
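To see how these flags behave, argparse can be exercised directly by passing a list of strings to parse_args in place of real command-line input (the flag names below mirror the script's; the override value is just an illustration):

```python
import argparse

# Same flags as run_text_completion.py.
parser = argparse.ArgumentParser()
parser.add_argument('--ckpt_dir', type=str, default='llama-2-7b/')
parser.add_argument('--tokenizer_path', type=str, default='tokenizer.model')
parser.add_argument('--max_seq_len', type=int, default=128)
parser.add_argument('--max_batch_size', type=int, default=4)

# An explicit list stands in for sys.argv during testing.
args = parser.parse_args(['--max_seq_len', '256'])
print(args.ckpt_dir)     # default is kept: llama-2-7b/
print(args.max_seq_len)  # overridden and converted to int: 256
```

Note that `type=int` both validates the flag and converts the string `'256'` to an integer before it reaches the model code.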
This script is a basic example of launching a Llama 2 7B text completion model with torchrun. You can modify the script and its arguments to fit your specific needs and experiment with different model configurations.
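As one way of extending the script, sampling knobs such as temperature and top_p could be exposed as flags and forwarded with `**vars(args)` so every flag maps onto a parameter of main(). The main() below is a hypothetical stand-in included only so the sketch is self-contained and runnable; the real one lives in example_text_completion.py:

```python
import argparse

# Hypothetical stand-in for example_text_completion.main, used here
# only to make the sketch self-contained.
def main(ckpt_dir, tokenizer_path, max_seq_len, max_batch_size,
         temperature, top_p):
    return (f"{ckpt_dir} seq={max_seq_len} batch={max_batch_size} "
            f"T={temperature} p={top_p}")

parser = argparse.ArgumentParser()
parser.add_argument('--ckpt_dir', type=str, default='llama-2-7b/')
parser.add_argument('--tokenizer_path', type=str, default='tokenizer.model')
parser.add_argument('--max_seq_len', type=int, default=128)
parser.add_argument('--max_batch_size', type=int, default=4)
# New sampling flags (defaults here are illustrative assumptions).
parser.add_argument('--temperature', type=float, default=0.6)
parser.add_argument('--top_p', type=float, default=0.9)

args = parser.parse_args(['--temperature', '0.8'])
# vars(args) turns the Namespace into a dict, so flag names map
# directly onto main()'s keyword parameters.
print(main(**vars(args)))
```

The `**vars(args)` pattern avoids re-listing every argument by hand, at the cost of requiring the flag names and parameter names to match exactly.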
Original source: https://www.cveoy.top/t/topic/qrVD. Copyright belongs to the author. Please do not repost or scrape.