
GPT-2 model: dataset cleanup fails with "No module named 'megatron'" #23

@Michael-1022

root@gpu03:/Megatron-LM/tools/openwebtext# python3 cleanup_dataset.py /workspace/data/merged_output.json /workspace/data/merged_cleand.json
Traceback (most recent call last):
  File "cleanup_dataset.py", line 12, in <module>
    from tokenizer import Tokenizer
  File "/root/Megatron-LM/tools/openwebtext/tokenizer.py", line 13, in <module>
    from megatron.core.datasets.megatron_tokenizer import MegatronTokenizer
ModuleNotFoundError: No module named 'megatron'
root@gpu03:/Megatron-LM/tools/openwebtext#

How can I resolve this?
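One possible cause, not confirmed in this thread: the script is launched from tools/openwebtext, so the repository root that contains the megatron package may not be on Python's import path. The following is a minimal sketch of a workaround under that assumption. The helper names find_repo_root and ensure_importable are hypothetical and not part of Megatron-LM; the sketch also assumes the megatron/ directory sits directly under the repository root.

```python
import sys
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path, package: str = "megatron") -> Optional[Path]:
    """Walk upward from `start` and return the first directory that
    contains a `package/` subdirectory, or None if there is none.

    Hypothetical helper: assumes the package lives at the repository root.
    """
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / package).is_dir():
            return candidate
    return None


def ensure_importable(start: Path, package: str = "megatron") -> bool:
    """Prepend the detected repository root to sys.path.

    Returns True when a root was found (and is now on sys.path),
    False when no enclosing directory contains `package/`.
    """
    root = find_repo_root(start, package)
    if root is None:
        return False
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return True
```

If this assumption holds, calling ensure_importable(Path(__file__).parent) at the top of cleanup_dataset.py, before the line that imports tokenizer, would let the megatron import resolve. Setting the PYTHONPATH environment variable to the repository root (/root/Megatron-LM according to the traceback) before running the script should have the same effect without editing any file.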
