RWKV
RWKV 高级教程RWKV 微调教程

全参微调入门教程

RWKV 模型全参微调教程(入门版)

本文旨在为刚接触 RWKV 模型的用户提供开箱即用的全量微调方法,文中使用的仓库为 RWKV 开源社区成员整理的“微调懒人包”,和官方 RWKV-LM 仓库中的微调方法不兼容

首先,准备一个 Linux 环境(可以是 WSL)和支持 CUDA 的 NVIDIA 显卡,然后往下阅读本文。

配置训练环境

要训练 RWKV 模型,首先要配置训练环境。

配置虚拟环境(conda)

RWKV 默认训练虚拟环境为 conda ,如果不知道如何配置 Linux 系统的 conda ,请参考此文章

安装微调训练所需的软件

要获得 RWKV 模型的最佳训练性能,请使用 python 3.10torch 2.1.2+cu121(或最新版本)、cuda 12.3+、最新的 deepspeed,但保持 pytorch-lightning ==1.9.5

在 Linux 环境中依次运行以下命令:

pip install torch --upgrade --extra-index-url https://download.pytorch.org/whl/cu121
pip install pytorch-lightning==1.9.5 transformers ninja einops tqdm gpustat nvitop bitsandbytes wandb datasets triton==2.2.0 --upgrade

第一条命令通过指定 url 安装 cu121 版本的 torch ,第二条命令安装 RWKV State tuning 所需的其他软件包,并指定安装 1.9.5 版本的 PyTorch Lightning、2.2.0 版本的 triton

-- upgrade 参数意味着如果环境中已经安装了对应的软件包,则将其升级到最新版本。

检查 CUDA 环境

依次运行以下命令,检查 CUDA 库是否可用:

python3 # 启动 python

import torch # 导入 torch 包

torch.cuda.is_available() # 检查 CUDA 是否可用

state-tuning-check-cuda

如果返回值为 True ,证明 CUDA 环境可用。如果返回 False ,请重新安装 CUDA 环境(cuda 12.3+)。

克隆 RWKV-LM 仓库

在 Linux 或 WSL 中,使用 git 命令克隆 RWKV-LM 仓库​:

git clone https://github.com/shoumenchougou/RWKV-LM.git

克隆完成后,使用 cd RWKV-LM/RWKV-v5 命令进入 RWKV-v5 文件夹。

full-ft-simple-cd-rwkv5

下载基底 RWKV 模型

你需要选择一个 RWKV 基底模型作为微调训练的起点,下载 RWKV 模型 并将其放在 RWKV-v5/model 文件夹(推荐下载最新的 RWKV-6 架构模型):

由于本文介绍的是 RWKV 全参微调方法,如果你在使用消费级显卡(4090 或更低级),对 1.5B 或更大参数的 RWKV 模型进行微调训练可能会失败。

如果你正在使用消费级显卡,建议尝试微调 RWKV-5-World-0.4B 模型 以跑通训练流程

如果你无法访问上述 HF 链接,请访问 HF 镜像链接: https://hf-mirror.com/BlinkDL

收集训练语料(数据集)

请遵循以下步骤,收集你的训练数据集,并将其放在 RWKV-v5/data 文件夹里面。

整理 jsonl 语料

首先确认“你希望 RWKV 模型学习哪些知识”。这里的知识可以是某一领域的资料,比如法律问答、金融知识等;也可以是某类任务的文本,如材料总结、角色扮演等。总而言之,你需要收集此类数据,并将其转化为特定的 jsonl 格式。

下面是各类内容的 jsonl 格式参考:

单轮问答: {"text": "User: 问题\n\nAssistant: 答案"}

多轮对话例子:{"text": "User: 问题一\n\nAssistant: 答案一\n\nUser: 问题二\n\nAssistant: 答案二"}

材料问答:{"text": "Instruction: 问题\n\nInput: 材料\n\nResponse: 答案"}

内容生成/续写:{"text": "将每篇文章的内容变成 JSONL 的一行,即使是一百万字的小说也变成一行。"} 或者 {"text": "《标题》\n正文内容"}

小说扩写:{"text": "User: 段落的开头\n\nAssistant: 完整的段落"}

整理完成的 jsonl 文件应该是这样的:

{"text": "System: You are an AI assistant. You will be given a task. You must generate a detailed and long answer.\n\nUser: Write an article based on this \"A man has been charged with murder and attempted murder after a woman and the man she was on a date with were stabbed at a restaurant in Sydney, Australia.\"\n\nAssistant: Title: Tragedy Strikes in Sydney: Victims Stabbed during Dinner Date, Suspect Charged with Murder and Attempted Murder\nIntroduction\nA seemingly ordinary evening took a tragic turn in Sydney, Australia when a romantic dinner date at a local restaurant ended in bloodshed, as a man allegedly attacked a couple, stabbing both victims. Local authorities have charged the suspect with murder and attempted murder in connection with the horrific crime. In this article, we delve into the chilling incident, the suspect, and the impact it has had on the community.\nThe Incident\nThe alarming event took place at a popular restaurant in Sydney, Australia. A woman and her date were enjoying their meal – blissfully unaware of the tragic circumstances that would soon unfold. As they shared conversation and indulged in their dinner, a man approached the couple and, without warning, began to stab them with a sharp object.\nPatrons at the restaurant scrambled for safety, as the assailant continued his violent rampage. Bystanders quickly called emergency services, who promptly arrived at the scene. Although both victims were rushed to a nearby hospital, the woman succumbed to her injuries. The man she was on a date with remained in critical condition, fighting for his life.\nThe Suspect\nInitial investigations revealed that the alleged attacker had no known connections to the victims – adding to the mystifying nature of this sudden and brutal assault. Witnesses reported that the assailant seemed to have no specific motive and appeared to carry out the act senselessly.\nFollowing a thorough investigation, local police identified and arrested the suspect. During the process, it was discovered that the alleged attacker had a history of criminal behavior and a troubled past, though it is unclear if this played a role in the tragic incident.\nAuthorities have formally charged the man with murder and attempted murder in connection with the heinous crime. He awaits a hearing to determine a trial date and, if convicted, could face a life sentence in prison.\nThe Community's Response\nThe shocking nature of the crime has left the residents of Sydney reeling, as they struggle to come to terms with the harrowing event. The restaurant where the attack occurred has since been closed, with a makeshift memorial being created outside to commemorate the victims.\nMany have questioned how such a vicious crime could happen in what is considered to be one of the safest cities in the world. This tragic event has spurred local officials to reassess current security measures and devise strategies to reinforce public safety. Additionally, discussions surrounding mental health and criminal rehabilitation have surfaced as residents seek to comprehend the actions of the alleged perpetrator and prevent similar incidents from occurring in the future.\nIn the wake of the stabbing, the community has banded together with an outpouring of grief and support for the victims and their families. Candlelight vigils have been held, and an online fundraising campaign is underway to assist the surviving victim with his medical expenses and recovery.\nConclusion\nThe tragic attack in Sydney serves as a chilling reminder that senseless acts of violence can happen anywhere and at any time. The community's response to this horrific and seemingly random act of brutality has been one of solidarity and determination to prevent such incidents in the future. As the case unfolds, the victims and their families remain in the hearts of the community, who are grieving the devastating loss of a life cut tragically short and supporting the recovering victim as he continues to endure this unimaginable ordeal."}
{"text": "User: Q:I read this background article the other day: Water stratification is when water masses with different properties - salinity (halocline), oxygenation (chemocline), density (pycnocline), temperature (thermocline) - form layers that act as barriers to water mixing which could lead to anoxia or euxinia.[1] These layers are normally arranged according to density, with the least dense water masses sitting above the more dense layers.  Water stratification also creates barriers to nutrient mixing between layers. This can affect the primary production in an area by limiting photosynthetic processes. When nutrients from the benthos cannot travel up into the photic zone, phytoplankton may be limited by nutrient availability. Lower primary production also leads to lower net productivity in waters.[2]  I am facing a new situation today: High lake had a lot of fresh and salt water mixing, therefore presented layers of stratified water.The water was poor in nutrients.Low lake had no such problems and it was a very abundant and prolific fishing spot.  Using the knowledge I acquired from the background article, how should I answer correctly the following question regarding my new situation: Which lake had fewer dissolved nutrients?\nA:\n\nAssistant: The lake with fewer dissolved nutrients would be High lake. This is because it had layers of stratified water, which acts as a barrier to nutrient mixing. As a result, it was poor in nutrients."}
{"text": "System: You are an AI assistant. You will be given a task. You must generate a detailed and long answer.\n\nUser: Summarize this article in one sentence.\nYou'll find this in your Start menu. You can also press ⊞ Win and type \"xbox.\" You can use the Xbox app to take a screenshot of any game running in Windows 10. You'll find this in the menu on the left side of the screen. This will display the DVR and screenshot settings. The Game DVR will need to be enabled in order to take screenshots with the app. Click the slider to toggle it on. When the Game DVR is enabled, the built-in shortcut is ⊞ Win+Alt+PrtScn. You can click the empty field next to this and set your own custom shortcut if you want. This will open the folder that your screenshots will be saved in. If you want to change where your screenshots are saved, you'll need to move the Captures folder to that location. Screenshots will automatically save to the Captures folder, wherever it is. For example, to move it to your Pictures folder, move up one directory and then drag the Captures folder to the Pictures folder in your sidebar. The Xbox app does not need to be open to use the screenshot feature. It will be saved to the Captures folder that you may have moved earlier.\n\nAssistant: The article explains how to use the Xbox app in Windows 10 to take screenshots of games by enabling the Game DVR and using the built-in shortcut or a custom shortcut, with the screenshots saved in the Captures folder which can be moved to a different location if desired."}
{"text": "System: You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.\n\nUser: what brought him into the field was the idea of louis de broglie that particles have a wave aspect. schrödinger sought our de broglie's work and read it avidly.\nRepeat this setence, but with the correct capitalization.\n\nAssistant: What brought him into the field was the idea of Louis de Broglie that particles have a wave aspect. Schrödinger sought out de Broglie's work and read it avidly.\n1. I capitalized the first letter of the first word of the sentence, \"What.\"\n2. I capitalized the first letter of the proper name \"Louis\" and \"Broglie\" (last name).\n3. I capitalized the first letter of the proper name \"Schrödinger.\"\n4. I corrected the typo \"our\" to \"out.\""}
{"text": "System: You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.\n\nUser: Add spaces: ifmelectronicgmbh•Friedrichstraße1•45128Essen—Wereservetherighttomaketechnicalalterationswithoutpriornotice.—GB—SV5200—03.06.2015\n\nAssistant: The task provided is to add spaces to a long, uninterrupted string of text to make it more readable. While determining where to place spaces, I'll be guided by the rules of the English language, as well as presumptions about the format of addresses, company names, dates, and technical/legal notes.\n1. Breaking down company details:\nFirst, I'm going to separate individual words and details related to the company and address. \n- \"ifmelectronicgmbh\": This appears to be a company name, which should be \"ifm electronic gmbh\".\n- \"•Friedrichstraße1•45128Essen\": This seems to be a street address in Germany (Friedrichstraße 1, 45128 Essen).\n- The bullet character (•) can be considered as a separator.\nAfter making these changes, the phrase so far is: \"ifm electronic gmbh • Friedrichstraße 1 • 45128 Essen\".\n2. Handling reservation clause:\nNext, I notice the segment of text starting with \"Wereservetherighttomaketechnicalalterationswithoutpriornotice.\" This appears to be a standard non-liability clause. Let's break it down:\n- \"Wereservetherighttomaketechnicalalterationswithoutpriornotice\": This can be separated into \"We reserve the right to make technical alterations without prior notice.\"\nAfter adding these modifications, the sentence so far becomes \"ifm electronic gmbh • Friedrichstraße 1 • 45128 Essen - We reserve the right to make technical alterations without prior notice.\"\n3. Identifying product details and date:\nTo wrap up, I'll divide the final details which seem to be related to a product model number and the date:\n- \"—GB—SV5200—03.06.2015\": Here, I can establish that \"GB\" and \"SV5200\" are likely separate entities (looks like product identifiers or codes), and \"03.06.2015\" is probably a date.\nTherefore, this segment becomes: \"- GB - SV5200 - 03.06.2015.\"\nSo, the final version of the separated text is: \"ifm electronic gmbh • Friedrichstraße 1 • 45128 Essen - We reserve the right to make technical alterations without prior notice. - GB - SV5200 - 03.06.2015.\"\nThis step-wise breakdown splits the original text into more readable sentences and phrases while maintaining their meanings and serving the reader's clarity."}
{"text": "System: You are a helpful assistant, who always provide explanation. Think like you are answering to a five year old.\n\nUser: What is the sentiment of the following movie review sentence?\na christmas perennial\n\nAssistant: A \"Christmas perennial\" means a movie that is enjoyed every Christmas and comes back like a tradition. So, the sentiment of this movie review sentence is positive because it suggests that people like to watch this movie again and again during Christmas time."}
{"text": "User: What type of details about the tenors  can be gathered from the following bio?  Bio: the tenors -lrb- formerly known as the canadian tenors -rrb- are a vocal quartet consisting of remigio pereira , victor micallef , fraser walters , and clifton murray . they perform operatic pop music that is a mixture of classical and pop , featuring songs such as `` the prayer '' and `` panis angelicus '' from the former genre , and leonard cohen 's `` hallelujah '' from the latter . originating from canada , pereira from ottawa/gatineau , micallef from toronto , walters from vancouver and murray from port mcneil , the tenors have performed on more than 60 international television programs . they appeared on the oprah winfrey show with celine dion in 2010 , at the 63rd primetime emmy awards , at the opening ceremonies of 2010 winter olympics in vancouver and on itv -lrb- tv network -rrb- diamond jubilee show at windsor castle for hm queen elizabeth ii . the venues they have performed in include the tel aviv opera house , and the mandalay bay events center in las vegas . they also appeared on the 2009 grey cup broadcast and on cbc television 's `` holiday festival of ice '' . they have shared the stage with sarah mclachlan , neil young , paul anka , justin bieber , paul mccartney , brian mcknight , david foster , and natalie cole . universal music group released their platinum-selling self-titled debut album and double-platinum holiday album , the perfect gift . the tenors filmed a pbs special at the smith center for the performing arts that began airing in 2012 . the canadian tenors were renamed the tenors in november 2012 in advance of their third album lead with your heart . also in 2012 , the tenors signed a us distribution deal with verve records . the tenors latest release lead with your heart went platinum in canada in seven weeks . the album garnered a 2013 juno award for adult contemporary album of the year and a juno award nomination for jack richardson for producer of the year for bob ezrin 's production of the title track `` forever young . '' lead with your heart was featured on katie , good...\nThe answer to this question is:\n\nAssistant: From the bio, we can gather that The Tenors:\n1. Were formerly known as The Canadian Tenors.\n2. Consist of Remigio Pereira, Victor Micallef, Fraser Walters, and Clifton Murray.\n3. Perform operatic pop music, a mixture of classical and pop.\n4. Have songs such as \"The Prayer,\" \"Panis Angelicus,\" and Leonard Cohen's \"Hallelujah\" in their repertoire.\n5. Originated from Canada, with members from Ottawa/Gatineau, Toronto, Vancouver, and Port McNeil.\n6. Have performed on more than 60 international television programs, including The Oprah Winfrey Show, the 63rd Primetime Emmy Awards, and the 2010 Winter Olympics opening ceremonies.\n7. Performed at venues such as the Tel Aviv Opera House and Mandalay Bay Events Center in Las Vegas.\n8. Have shared the stage with artists like Sarah McLachlan, Neil Young, Paul Anka, Justin Bieber, Paul McCartney, Brian McKnight, David Foster, and Natalie Cole.\n9. Released platinum-selling self-titled debut album, a double-platinum holiday album called The Perfect Gift, and their third album, Lead With Your Heart.\n10. Filmed a PBS special at the Smith Center for the Performing Arts.\n11. Renamed themselves from The Canadian Tenors to The Tenors in November 2012.\n12. Signed a US distribution deal with Verve Records in 2012.\n13. Received a 2013 Juno Award for Adult Contemporary Album of the Year and a Juno Award nomination for Jack Richardson Producer of the Year for their album Lead With Your Heart."}
{"text": "System: You should describe the task and explain your answer. While answering a multiple choice question, first output the correct answer(s). Then explain why other answers are wrong. Think like you are answering to a five year old.\n\nUser: What is the most logical next event?\nHow to unblock copyright infringement on youtube \nUnderstand why you may have received a content id claim. \nContent id is a system that identifies potential copyrighted content in videos by scanning uploaded videos for previously-uploaded content. The system will scan for audio, video, and images.\nChoose your answer from:\na). Answer as many questions as you can. If you are currently having trouble out of confusion or have not answered a question assigned by your user, then this is a great way to check for copyright infringement.\nb). If a match occurs, the original owner is notified and a content id claim is filed. The original owner can choose to do nothing, mute their audio in your video, block the video from being viewed, monetize the video, or track the video's viewership.\nc). Figure out whether you have used the content id app. Copyright-infringement lawsuits can take up to a few years to fully prove that you use the content id app in your videos.\nd). If you have uploaded video videos to youtube, it will scan for audio, video, and images in your channel. You may receive your content id when you sign up for a video profile.\n\nAssistant: b). If a match occurs, the original owner is notified and a content id claim is filed. The original owner can choose to do nothing, mute their audio in your video, block the video from being viewed, monetize the video, or track the video's viewership.\nThis is the correct answer because it describes the next logical event in the process of handling a content id claim on YouTube. When a match is found, the original owner gets to decide what to do with the claimed content.\na) is wrong because it talks about answering questions and checking for copyright infringement, which is not directly related to the next step in unblocking copyright infringement on YouTube.\nc) is incorrect because it discusses proving the use of the content id app in lawsuits, which is not the next step in dealing with a content id claim on YouTube.\nd) is wrong because it talks about uploading videos to YouTube and receiving content id, but not about the next step in resolving a content id claim."}
{"text": "System: You are an AI assistant. Provide a detailed answer so user don’t need to search outside to understand the answer.\n\nUser: OPTIONS:\n[i] Yes;\n[ii] No;\nGiven those answer options, answer the question: Title: Good book with a bad description of the product Review: Although this looks to be a great book, I was disappointed to see the giant inscription on the first page of the book. Because this book was for a gift, noting this inscription on the product description would have been very helpful Is this product review negative?\nA:\n\nAssistant: Yes, this product review is negative. The reviewer expresses disappointment with the product due to the giant inscription on the first page, which was not mentioned in the product description. They also mention that the book was intended as a gift, which further explains their dissatisfaction with the unexpected inscription. Although the reviewer acknowledges that the book might be great, their overall experience with the product seems to be negative due to this issue."}
{"text": "System: You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.\n\nUser: Write some highlights for the following article:\nThe historic six-month agreement over Tehran's nuclear program may begin a new era of relations with Iran, but it will be a long road back for the country's most vital sector, oil. Iran produces about two and half million barrels a day -- far off its 4-million-barrel-per-day peak a decade ago.  Output is hovering at a level last seen at the end of Iran's war with Iraq. With North Sea Brent crude averaging over $100 a barrel for a record three years running, the sanctions on energy alone are costing Tehran about $50 billion in lost annual revenue. Despite the deal breakthrough, U.S. Secretary of State John Kerry said most of the sanctions will stick as the world gauges the intentions of this relatively new administration in Tehran. \"During the six month phase, the oil sanctions that will remain in place will continue to cause over $25 billion in lost revenues to Iran or over $4 billion a month,\" he said. The message is clear:  the pressure remains, but if all goes well, in a half year's time Iran can expect more in return for transparency. The last few years have been filled with uncertainty. Tensions around the Strait of Hormuz with the on-and-off threats by the previous government of Mahmoud Ahmadinejad to block oil shipments in the Gulf have kept what strategists call a 10-15% risk premium on global energy prices. Iran sits on about 9% of the world's proven oil reserves, claiming a few years back that it has nearly 150 billion barrels and the world's largest gas field. But its top four customers --China, India, Japan and South Korea -- have all had to cut back their energy imports by a third or more in the past few years due to U.S. and European pressure. With every year that has passed, the screws have been tightened by Washington and the countries of the European Union.  It was not only sanctions against oil, but also blocking Iran's ability to secure shipping insurance and to trade in U.S. dollars and euros.  That economic isolation, many Middle East strategists I have spoken with suggest, is what brought Iran's new government to the negotiating table. In 2012, the Iranian rial plunged by up to 80%.  Basic staples of life, poultry and bread skyrocketed in price and the economy went through a period of hyper-inflation. With rising import prices due to a plummeting currency, industrial Iran is no longer able to compete.  There are reports that the state manufacturing sector had laid off up to 800,000 workers in 2012 and those who have kept their jobs saw their wages eroded by skyrocketing prices. Oil executives with experience in the country say this initial agreement would help lift a cloud of uncertainty over the oil market, but that President Hassan Rouhani and his cabinet have to establish much better contract terms if this honeymoon period lifts sanctions. The blunt-speaking Chief Executive of French energy group Total, Christophe de Margerie, told me at an energy conference in Abu Dhabi this month that Iran needs to try and create a better climate for investment if this weekend's breakthrough is sustained. \"I can tell you with the experience we have from Iran it doesn't always bring as we say in French partie de plaisir ... a win-win.\" Oil giant Saudi Arabia has expressed doubts about signing a deal with Tehran, which will introduce more challenges within OPEC.  Iraq plans to double production by 2020 to six million barrels a day and with Iran wanting to rebuild exports, the Kingdom may have trim its own production to defend prices. It is still early days, but this country of nearly 80 million people has been described as potentially being the Germany of the Middle East with plenty of natural resources -- that is, if it can emerge from years of economic isolation.\nHighlights:\n\nAssistant: 1. Historic six-month agreement over Tehran's nuclear program presents an opportunity for Iran's oil sector recovery, but the journey will be long and difficult.\n2. Iran's oil production is currently around 2.5 million barrels per day, significantly less than its peak of 4 million barrels per day a decade ago.\n3. Sanctions on energy alone are costing Tehran approximately $50 billion in lost annual revenue.\n4. U.S. Secretary of State John Kerry said most sanctions will remain in place during the six-month phase, causing Iran to lose over $25 billion in revenues.\n5. Iran's top four oil customers, China, India, Japan, and South Korea, have had to cut back their imports by a third or more due to U.S. and European pressure.\n6. Iran's economy suffered greatly during the period of sanctions, with rampant inflation and industrial layoffs.\n7. Establishing better contract terms is crucial for Iran to lift sanctions and improve the investment climate.\n8. Oil giant Saudi Arabia has doubts about signing a deal with Tehran, introducing challenges within OPEC for Iran's oil output growth."}

将 jsonl 文件转化为 binidx 文件

在 RWKV-LM/RWKV-v5 目录中运行以下命令,可以将 data 文件夹中的 jsonl 文件转成 binidx 文件 :

python3 tools/preprocess_data.py --input ./data/sample.jsonl --output-prefix ./data/sample --vocab ./rwkv_vocab_v20230424.txt --dataset-impl mmap --tokenizer-type RWKVTokenizer --append-eod

full-ft-simple-jsonl-to-binidx

如果你在使用自己的 jsonl 数据集,这条命令有两个需要修改的参数:

--input ./data/sample.jsonl # 作为输入的 jsonl 文件路径
--output-prefix ./data/sample # 输出的 bin 和 idx 文件路径

sample.jsonl 改成你自己的 jsonl 文件名称,其他参数保持默认即可。

出现以下提示,意味着转换已经完成:

full-ft-simple-jsonl-to-binidx-2

修改训练参数

使用任意文本编辑器(如 vscode)打开 demo-training-run.sh 文件,可以调整训练参数,进而控制微调的训练过程和训练效果。:

full-ft-simple-modify-parameters

以下是一次调参过程的记录:

调整路径参数

demo-training-run.sh 文件前三行是文件路径参数:

  • load_model: 基底 RWKV 模型的路径
  • proj_dir:训练日志和模型文件的输出路径
  • data_file:训练数据集的路径,注意路径中不需要带 bin 和 idx 后缀,仅需文件名称。

调整 n_layer 和 n_embd 参数

基底 RWKV 模型的参数大小不同,训练时使用的 n_layer 和 n_embd 数值也不一样。以下是不同参数的 RWKV 模型和 n_layer 和 n_embd 数值的对照列表:

模型参数n_layern_embd
0.1B12768
0.4B241024
1.5B242048
3B322560
7B324096
14B614096

调整重要训练参数

以下参数建议根据你的微调数据、设备性能进行调整。

参数描述
micro_bsz=1根据显存大小调整,微调时从 1 开始尝试增大
epoch_save=5每隔多少个训练轮次保存一次训练模型,调小需要注意存储空间是否充足
epoch_steps=1000每个训练轮次的步数,增加会拉长单个epoch的训练时间
ctx_len=512微调模型的上下文长度,根据你的需求而定

调整其他训练参数

下面列出了脚本中其他可修改的训练参数,及其修改的效果。

参数描述
--data_type binidx训练语料的文件格式,支持:"utf-8", "utf-16le", "numpy", "binidx", "dummy", "wds_img", "uint16"
--vocab_size 65536词表大小,默认为 65536,设为 0 则模型自动确定词汇表大小
--epoch_count 1总训练轮次
--epoch_begin 0初始训练轮次,从第 N 个 epoch 开始加载
--pre_ffn 0将第一个 att 层替换为 ffn,通常保持关闭
--head_qk 0保持默认值 0(关闭状态)
--lr_init 1e-5初始学习率,微调建议 1e-5,可适当增加至不超过 5e-5
--lr_final 1e-5最终学习率,微调同样建议 1e-5,熟练后可自定义
--warmup_steps 0预热步骤,微调初学者建议 0;若加载模型,可尝试 50
--beta1 0.9Adam 优化器 beta1 参数
--beta2 0.99Adam 优化器 beta2 参数
--adam_eps 1e-8Adam 优化器的 epsilon 参数
--accelerator gpu加速器类型,推荐使用 gpu
--devices 1单显卡填 1,多卡按实际数量
--precision bf16训练精度,默认为 bf16,也支持 "fp32", "tf32", "fp16"
--strategy deepspeed_stage_1训练策略,微调推荐 deepspeed_stage_1
--grad_cp 1梯度累积步数,0 加快训练但消耗更多显存,1 反之
--my_testing "x060"训练的 RWKV 模型版本,v5 选 x052,v6 选 x060

参数调整完成后,请记得保存 demo-training-run.sh 文件。

开始训练

在 RWKV-V5 目录中,使用以下命令运行 demo-training-run.sh 脚本:

sh demo-training-run.sh

成功运行后,应该可以看到类似于这样的效果:

full-ft-simple-training-successful

这份文档对您有帮助吗?
联系我们© 2026 RWKV. All rights reserved.粤ICP备2024242518号-1