Adapting the internlm2.5 base model: could the documentation be improved? #268

Closed
RyanOvO opened this issue Jul 11, 2024 · 4 comments
@RyanOvO

RyanOvO commented Jul 11, 2024

Responding to an issue

Could you publish the list of dataset files used for fine-tuning internlm2.5, and explain how they were converted to jsonl?
[image]

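@RyanOvO changed the title from "Adapting the internlm2.5 base model: could the documentation be improved?" to "Adapting the internlm2.5 base model and deploying it for inference: could the documentation be improved?" Jul 11, 2024
@RyanOvO changed the title from "Adapting the internlm2.5 base model and deploying it for inference: could the documentation be improved?" to "Adapting the internlm2.5 base model: could the documentation be improved?" Jul 11, 2024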
@aJupyter
Collaborator

OK, we will update the documentation later.

@hi-pengyu
Contributor

For now, here is a Python script that converts JSON to JSONL; the documentation will be improved later.

import json


def json_array_to_jsonl(json_file_path, jsonl_file_path):
    """
    Convert a file containing a JSON array to JSONL format.

    Args:
    - json_file_path: path to the input JSON file
    - jsonl_file_path: path to the output JSONL file
    """
    with open(json_file_path, 'r', encoding='utf-8') as json_file:
        # Load the entire JSON array into memory
        data = json.load(json_file)

    with open(jsonl_file_path, 'w', encoding='utf-8') as jsonl_file:
        # Write each array element as one JSON object per line
        for obj in data:
            jsonl_file.write(json.dumps(obj, ensure_ascii=False) + '\n')


# Usage example
json_array_to_jsonl('aa.json', 'output2.jsonl')
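
A converted file can be sanity-checked by reading it back line by line. The following is only a minimal sketch and is not part of the original script: the helper names `read_jsonl`, `write_jsonl`, and `round_trip_ok` are hypothetical, and the sample records use made-up field names that may not match the actual fine-tuning dataset.

```python
import json
import os
import tempfile


def write_jsonl(records, jsonl_file_path):
    """Write a list of JSON-serializable records, one per line."""
    with open(jsonl_file_path, 'w', encoding='utf-8') as jsonl_file:
        for record in records:
            jsonl_file.write(json.dumps(record, ensure_ascii=False) + '\n')


def read_jsonl(jsonl_file_path):
    """Read a JSONL file back into a list, one JSON value per non-empty line."""
    records = []
    with open(jsonl_file_path, 'r', encoding='utf-8') as jsonl_file:
        for line_number, line in enumerate(jsonl_file, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f'line {line_number} is not valid JSON: {exc}'
                ) from exc
    return records


def round_trip_ok(records):
    """Return True if the records survive a write/read cycle unchanged."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'sample.jsonl')
        write_jsonl(records, path)
        return read_jsonl(path) == records


# Hypothetical sample records; the real dataset's field names may differ.
sample = [
    {'input': '你好', 'output': 'Hello'},
    {'input': 'text with a\nline break', 'output': 'ok'},
]
print(round_trip_ok(sample))
```

Because `json.dumps` escapes newlines inside strings, each record stays on a single line even when a field contains a line break, which is what makes the line-by-line read safe.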

@RyanOvO
Author

RyanOvO commented Jul 11, 2024

OK, thanks.

@RyanOvO RyanOvO closed this as completed Jul 11, 2024
@chg0901
Collaborator

chg0901 commented Jul 26, 2024

[EmoLLM][InternLM2.5] EmoLLM V3.0 Preview: Full-Parameter Fine-Tuning in Practice with InternLM2.5-7B-Chat - Zhihu
https://zhuanlan.zhihu.com/p/708931911

You can refer to this document.

Also take a look at the replies in the open issue about "爹系男友" ("daddy-style boyfriend").

@chg0901 chg0901 added the Informative Responses to Thoughtful Questions Good Answers for QA Issues label Jul 26, 2024