fix: translate remaining Chinese in sim config and profile prompts

- Time config: translate all Chinese instructions and field descriptions
- Event config: translate hot topics/narrative direction instructions
- Agent config: translate entity type descriptions and field labels
- Profile generator: translate all persona prompt fields and instructions
- Country field: changed from 'use Chinese' to 'use English'
Kunthawat Greethong
2026-06-17 19:36:05 +07:00
parent 5fcce79361
commit 431b66fd85
2 changed files with 102 additions and 102 deletions


@@ -684,43 +684,43 @@ class OasisProfileGenerator:
) -> str: ) -> str:
"""构建个人实体的详细人设提示词""" """构建个人实体的详细人设提示词"""
attrs_str = json.dumps(entity_attributes, ensure_ascii=False) if entity_attributes else "" attrs_str = json.dumps(entity_attributes, ensure_ascii=False) if entity_attributes else "None"
context_str = context[:3000] if context else "无额外上下文" context_str = context[:3000] if context else "No additional context"
return f"""Generate detailed social media user persona for the entity, maximizing fidelity to real-world situations. return f"""Generate detailed social media user persona for the entity, maximizing fidelity to real-world situations.
实体名称: {entity_name} Entity name: {entity_name}
实体类型: {entity_type} Entity type: {entity_type}
实体摘要: {entity_summary} Entity summary: {entity_summary}
实体属性: {attrs_str} Entity attributes: {attrs_str}
上下文信息: Context information:
{context_str} {context_str}
请生成JSON包含以下字段: Please generate JSON with the following fields:
1. bio: 社交媒体简介200字 1. bio: Social media bio, 200 characters
2. persona: 详细人设描述2000字的纯文本需包含: 2. persona: Detailed persona description (2000 chars plain text), must include:
- 基本信息(年龄、职业、教育背景、所在地) - Basic info (age, occupation, education, location)
- 人物背景(重要经历、与事件的关联、社会关系) - Background (important experiences, connection to event, social relations)
- 性格特征MBTI类型、核心性格、情绪表达方式 - Personality (MBTI type, core personality, emotional expression style)
- 社交媒体行为(发帖频率、内容偏好、互动风格、语言特点) - Social media behavior (posting frequency, content preferences, interaction style, language traits)
- 立场观点(对话题的态度、可能被激怒/感动的内容) - Stance (attitude toward topics, content that may trigger/move them)
- 独特特征(口头禅、特殊经历、个人爱好) - Unique traits (catchphrases, special experiences, personal hobbies)
- 个人记忆(人设的重要部分,要介绍这个个体与事件的关联,以及这个个体在事件中的已有动作与反应) - Personal memory (important part of persona: individual's connection to the event, and their existing actions and reactions)
3. age: 年龄数字(必须是整数) 3. age: Age number (must be integer)
4. gender: 性别,必须是英文: "male" "female" 4. gender: Gender, must be English: "male" or "female"
5. mbti: MBTI类型(如INTJENFP等) 5. mbti: MBTI type (e.g. INTJ, ENFP)
6. country: 国家(使用中文,如"中国" 6. country: Country (use English, e.g. "Thailand", "China")
7. profession: 职业 7. profession: Occupation
8. interested_topics: 感兴趣话题数组 8. interested_topics: Array of topics of interest
重要: Important:
- 所有字段值必须是字符串或数字,不要使用换行符 - All field values must be strings or numbers, do not use newlines
- persona必须是一段连贯的文字描述 - persona must be a coherent text description
- {get_language_instruction()} (gender field must be in English: male/female) - {get_language_instruction()} (gender field must be in English: male/female)
- 内容要与实体信息保持一致 - Content must be consistent with entity information
- age必须是有效的整数gender必须是"male""female" - age must be a valid integer, gender must be "male" or "female"
""" """
def _build_group_persona_prompt( def _build_group_persona_prompt(
@@ -733,42 +733,42 @@ class OasisProfileGenerator:
) -> str: ) -> str:
"""构建群体/机构实体的详细人设提示词""" """构建群体/机构实体的详细人设提示词"""
attrs_str = json.dumps(entity_attributes, ensure_ascii=False) if entity_attributes else "" attrs_str = json.dumps(entity_attributes, ensure_ascii=False) if entity_attributes else "None"
context_str = context[:3000] if context else "无额外上下文" context_str = context[:3000] if context else "No additional context"
return f"""Generate detailed social media account settings for organization/group entity, maximizing fidelity to real-world situations. return f"""Generate detailed social media account settings for organization/group entity, maximizing fidelity to real-world situations.
实体名称: {entity_name} Entity name: {entity_name}
实体类型: {entity_type} Entity type: {entity_type}
实体摘要: {entity_summary} Entity summary: {entity_summary}
实体属性: {attrs_str} Entity attributes: {attrs_str}
上下文信息: Context information:
{context_str} {context_str}
请生成JSON包含以下字段: Please generate JSON with the following fields:
1. bio: 官方账号简介200字专业得体 1. bio: Official account bio, 200 characters, professional and appropriate
2. persona: 详细账号设定描述2000字的纯文本需包含: 2. persona: Detailed account settings description (2000 chars plain text), must include:
- 机构基本信息(正式名称、机构性质、成立背景、主要职能) - Organization basics (official name, nature, founding background, main functions)
- 账号定位(账号类型、目标受众、核心功能) - Account positioning (account type, target audience, core function)
- 发言风格(语言特点、常用表达、禁忌话题) - Speaking style (language traits, common expressions, taboo topics)
- 发布内容特点(内容类型、发布频率、活跃时间段) - Content characteristics (content type, posting frequency, active hours)
- 立场态度(对核心话题的官方立场、面对争议的处理方式) - Stance (official position on core topics, handling of controversies)
- 特殊说明(代表的群体画像、运营习惯) - Special notes (represented group profile, operational habits)
- 机构记忆(机构人设的重要部分,要介绍这个机构与事件的关联,以及这个机构在事件中的已有动作与反应) - Organization memory (important part: organization's connection to the event, and their existing actions and reactions)
3. age: 固定填30机构账号的虚拟年龄 3. age: Always 30 (virtual age for organization accounts)
4. gender: 固定填"other"机构账号使用other表示非个人 4. gender: Always "other" (organization accounts use "other" to indicate non-personal)
5. mbti: MBTI类型用于描述账号风格如ISTJ代表严谨保守 5. mbti: MBTI type to describe account style, e.g. ISTJ for rigorous and conservative
6. country: 国家(使用中文,如"中国" 6. country: Country (use English, e.g. "Thailand", "China")
7. profession: 机构职能描述 7. profession: Organization function description
8. interested_topics: 关注领域数组 8. interested_topics: Array of areas of interest
重要: Important:
- 所有字段值必须是字符串或数字不允许null值 - All field values must be strings or numbers, null values not allowed
- persona必须是一段连贯的文字描述,不要使用换行符 - persona must be a coherent text description, do not use newlines
- {get_language_instruction()} (gender field must be the English string "other") - {get_language_instruction()} (gender field must be the English string "other")
- age必须是整数30gender必须是字符串"other" - age must be integer 30, gender must be string "other"
- Organization account posts must match its identity positioning""" - Organization account posts must match its identity positioning"""
def _generate_profile_rule_based( def _generate_profile_rule_based(
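
Review note: both persona prompts above put constraints on the returned JSON (integer `age`, a fixed set of `gender` values, no newlines in `persona`). A minimal sketch of how a caller might check a response against those constraints follows. `validate_profile` and `REQUIRED_FIELDS` are hypothetical names and are not taken from `OasisProfileGenerator`; the check assumes the model returns a single JSON object.

```python
import json

# Field names as listed in the prompts; the helper itself is hypothetical.
REQUIRED_FIELDS = (
    "bio", "persona", "age", "gender",
    "mbti", "country", "profession", "interested_topics",
)


def validate_profile(raw: str, is_group: bool = False) -> list[str]:
    """Return a list of problems found in a JSON profile; empty means it passed."""
    profile = json.loads(raw)
    if not isinstance(profile, dict):
        return ["profile must be a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in profile]

    age = profile.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        problems.append("age must be an integer")
    elif is_group and age != 30:
        problems.append("group age must be 30")

    # Individuals use male/female; organization accounts use "other".
    allowed = {"other"} if is_group else {"male", "female"}
    if profile.get("gender") not in allowed:
        problems.append(f"gender must be one of {sorted(allowed)}")

    persona = profile.get("persona")
    if isinstance(persona, str) and "\n" in persona:
        problems.append("persona must not contain newlines")

    return problems


example = json.dumps({
    "bio": "Campus news account", "persona": "Single paragraph.",
    "age": 30, "gender": "other", "mbti": "ISTJ", "country": "Thailand",
    "profession": "University communications", "interested_topics": ["education"],
})
example_problems = validate_profile(example, is_group=True)
```

Returning a list of problems instead of raising lets the caller decide whether to retry the model call or fall back to the rule-based profile path.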


@@ -544,24 +544,24 @@ class SimulationConfigGenerator:
{context_truncated} {context_truncated}
## 任务 ## Task
Please generate time config JSON. Please generate time config JSON.
### 基本原则(仅供参考,需根据具体事件和参与群体灵活调整): ### Basic Principles (for reference only, adjust flexibly based on specific event and participant group):
- Please infer the target user group's timezone and daily routine based on the simulation scenario. Below is a reference example for UTC+8 - Please infer the target user group's timezone and daily routine based on the simulation scenario. Below is a reference example for UTC+8
- 凌晨0-5点几乎无人活动活跃度系数0.05 - Midnight 0-5am: almost no activity (activity multiplier 0.05)
- 早上6-8点逐渐活跃活跃度系数0.4 - Morning 6-8am: gradually active (activity multiplier 0.4)
- 工作时间9-18点中等活跃活跃度系数0.7 - Work hours 9-18: moderately active (activity multiplier 0.7)
- 晚间19-22点是高峰期活跃度系数1.5 - Evening 19-22: peak hours (activity multiplier 1.5)
- 23点后活跃度下降活跃度系数0.5 - After 23:00: activity decreases (activity multiplier 0.5)
- 一般规律:凌晨低活跃、早间渐增、工作时段中等、晚间高峰 - General pattern: low activity late night, increasing morning, moderate during work hours, peak in evening
- **重要**:以下示例值仅供参考,你需要根据事件性质、参与群体特点来调整具体时段 - **Important**: The example values above are for reference only. You must adjust specific time periods based on event nature and participant characteristics
- 例如学生群体高峰可能是21-23点媒体全天活跃官方机构只在工作时间 - Example: Student groups may peak at 21-23; media is active all day; government agencies are active only during work hours
- 例如突发热点可能导致深夜也有讨论off_peak_hours 可适当缩短 - Example: Breaking hot topics may cause late-night discussions, off_peak_hours can be shortened accordingly
### 返回JSON格式不要markdown ### Return in JSON format (no markdown)
示例: Example:
{{ {{
"total_simulation_hours": 72, "total_simulation_hours": 72,
"minutes_per_round": 60, "minutes_per_round": 60,
@@ -574,15 +574,15 @@ Please generate time config JSON.
"reasoning": "Explanation of time config for this event" "reasoning": "Explanation of time config for this event"
}} }}
字段说明: Field descriptions:
- total_simulation_hours (int): 模拟总时长24-168小时突发事件短、持续话题长 - total_simulation_hours (int): Total simulation duration, 24-168 hours, shorter for breaking events, longer for ongoing topics
- minutes_per_round (int): 每轮时长30-120分钟建议60分钟 - minutes_per_round (int): Minutes per round, 30-120 minutes, recommended 60 minutes
- agents_per_hour_min (int): 每小时最少激活Agent数取值范围: 1-{max_agents_allowed} - agents_per_hour_min (int): Minimum agents activated per hour (range: 1-{max_agents_allowed})
- agents_per_hour_max (int): 每小时最多激活Agent数取值范围: 1-{max_agents_allowed} - agents_per_hour_max (int): Maximum agents activated per hour (range: 1-{max_agents_allowed})
- peak_hours (int数组): 高峰时段,根据事件参与群体调整 - peak_hours (int array): Peak hours, adjust based on event participant group
- off_peak_hours (int数组): 低谷时段,通常深夜凌晨 - off_peak_hours (int array): Off-peak hours, typically late night/early morning
- morning_hours (int数组): 早间时段 - morning_hours (int array): Morning hours
- work_hours (int数组): 工作时段 - work_hours (int array): Work hours
- reasoning (string): Brief explanation of why this config was chosen""" - reasoning (string): Brief explanation of why this config was chosen"""
system_prompt = "You are a social media simulation expert. Return pure JSON format. Time config must match the target user group's daily routine in the simulation scenario." system_prompt = "You are a social media simulation expert. Return pure JSON format. Time config must match the target user group's daily routine in the simulation scenario."
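
Review note: the field descriptions in the time-config prompt give numeric ranges, so a response can be range-checked before use. The sketch below is one way to do that under stated assumptions: `check_time_config` is a hypothetical helper, not part of `SimulationConfigGenerator`, and the 0-23 bound on the hour arrays is an assumption (hours of the day) that the prompt does not spell out.

```python
def check_time_config(config: dict, max_agents_allowed: int) -> list[str]:
    """Return a list of range violations in a time config; empty means it passed."""
    problems = []

    def in_range(name: str, low: int, high: int) -> None:
        value = config.get(name)
        # bool is a subclass of int, so reject it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            problems.append(f"{name} must be an int in {low}-{high}")

    # Ranges as stated in the prompt's field descriptions.
    in_range("total_simulation_hours", 24, 168)
    in_range("minutes_per_round", 30, 120)
    in_range("agents_per_hour_min", 1, max_agents_allowed)
    in_range("agents_per_hour_max", 1, max_agents_allowed)

    # Assumption: hour arrays hold hours of the day, 0 through 23.
    for name in ("peak_hours", "off_peak_hours", "morning_hours", "work_hours"):
        hours = config.get(name)
        valid = isinstance(hours, list) and all(
            isinstance(h, int) and not isinstance(h, bool) and 0 <= h <= 23
            for h in hours
        )
        if not valid:
            problems.append(f"{name} must be a list of hours in 0-23")

    return problems


example = {
    "total_simulation_hours": 72,
    "minutes_per_round": 60,
    "agents_per_hour_min": 5,
    "agents_per_hour_max": 20,
    "peak_hours": [19, 20, 21, 22],
    "off_peak_hours": [0, 1, 2, 3, 4, 5],
    "morning_hours": [6, 7, 8],
    "work_hours": list(range(9, 19)),
}
example_problems = check_time_config(example, max_agents_allowed=50)
```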
@@ -679,27 +679,27 @@ Simulation requirements: {simulation_requirement}
{context_truncated} {context_truncated}
## 可用实体类型及示例 ## Available Entity Types and Examples
{type_info} {type_info}
## 任务 ## Task
Please generate event config JSON: Please generate event config JSON:
- 提取热点话题关键词 - Extract hot topic keywords
- 描述舆论发展方向 - Describe the narrative direction of public opinion
- 设计初始帖子内容,**每个帖子必须指定 poster_type发布者类型** - Design initial post content, **each post must specify poster_type (publisher type)**
**重要**: poster_type 必须从上面的"可用实体类型"中选择,这样初始帖子才能分配给合适的 Agent 发布。 **Important**: poster_type must be selected from the "Available Entity Types" above, so initial posts can be assigned to suitable Agents.
例如:官方声明应由 Official/University 类型发布,新闻由 MediaOutlet 发布,学生观点由 Student 发布。 For example: Official statements should be posted by Official/University types, news by MediaOutlet, student opinions by Student.
Return in JSON format (no markdown): Return in JSON format (no markdown):
{{ {{
"hot_topics": ["关键词1", "关键词2", ...], "hot_topics": ["keyword1", "keyword2", ...],
"narrative_direction": "<舆论发展方向描述>", "narrative_direction": "<description of public opinion direction>",
"initial_posts": [ "initial_posts": [
{{"content": "帖子内容", "poster_type": "实体类型(必须从可用类型中选择)"}}, {{"content": "post content", "poster_type": "entity type (must be from available types)"}},
... ...
], ],
"reasoning": "<简要说明>" "reasoning": "<brief explanation>"
}}""" }}"""
system_prompt = "You are a public opinion analysis expert. Return pure JSON format. Note: poster_type must exactly match available entity types." system_prompt = "You are a public opinion analysis expert. Return pure JSON format. Note: poster_type must exactly match available entity types."
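
Review note: because the event prompt requires `poster_type` to exactly match an available entity type, mismatches can be caught with a simple membership check. A minimal sketch, assuming the response has already been parsed into a dict; `find_invalid_posts` is a hypothetical name, and the type names in the example are the ones the prompt itself mentions.

```python
def find_invalid_posts(event_config: dict, available_types: set[str]) -> list[dict]:
    """Return initial posts whose poster_type is not an exact match for an available type."""
    return [
        post
        for post in event_config.get("initial_posts", [])
        if post.get("poster_type") not in available_types
    ]


available = {"Official", "University", "MediaOutlet", "Student"}
event_config = {
    "hot_topics": ["keyword1"],
    "initial_posts": [
        {"content": "Official statement", "poster_type": "University"},
        {"content": "Student opinion", "poster_type": "student"},  # wrong case
    ],
}
invalid = find_invalid_posts(event_config, available)
```

The comparison is case-sensitive on purpose, since the system prompt asks for an exact match.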
@@ -834,33 +834,33 @@ Return in JSON format (no markdown):
Simulation requirements: {simulation_requirement} Simulation requirements: {simulation_requirement}
## 实体列表 ## Entity List
```json ```json
{json.dumps(entity_list, ensure_ascii=False, indent=2)} {json.dumps(entity_list, ensure_ascii=False, indent=2)}
``` ```
## 任务 ## Task
Generate activity config for each entity, noting: Generate activity config for each entity, noting:
- **Time must match target user group routine**: Below is a reference (UTC+8), please adjust based on simulation scenario - **Time must match target user group routine**: Below is a reference (UTC+8), please adjust based on simulation scenario
- **官方机构**University/GovernmentAgency):活跃度低(0.1-0.3),工作时间(9-17)活动,响应慢(60-240分钟),影响力高(2.5-3.0) - **Government agencies** (University/GovernmentAgency): Low activity (0.1-0.3), active during work hours (9-17), slow response (60-240 min), high influence (2.5-3.0)
- **媒体**MediaOutlet):活跃度中(0.4-0.6),全天活动(8-23),响应快(5-30分钟),影响力高(2.0-2.5) - **Media** (MediaOutlet): Medium activity (0.4-0.6), active all day (8-23), fast response (5-30 min), high influence (2.0-2.5)
- **个人**Student/Person/Alumni):活跃度高(0.6-0.9),主要晚间活动(18-23),响应快(1-15分钟),影响力低(0.8-1.2) - **Individuals** (Student/Person/Alumni): High activity (0.6-0.9), mainly evening (18-23), fast response (1-15 min), low influence (0.8-1.2)
- **公众人物/专家**:活跃度中(0.4-0.6),影响力中高(1.5-2.0) - **Public figures/experts**: Medium activity (0.4-0.6), medium-high influence (1.5-2.0)
Return in JSON format (no markdown): Return in JSON format (no markdown):
{{ {{
"agent_configs": [ "agent_configs": [
{{ {{
"agent_id": <必须与输入一致>, "agent_id": <must match input>,
"activity_level": <0.0-1.0>, "activity_level": <0.0-1.0>,
"posts_per_hour": <发帖频率>, "posts_per_hour": <posting frequency>,
"comments_per_hour": <评论频率>, "comments_per_hour": <commenting frequency>,
"active_hours": [<活跃小时列表,考虑中国人作息>], "active_hours": [<list of active hours, consider target user routine>],
"response_delay_min": <最小响应延迟分钟>, "response_delay_min": <min response delay in minutes>,
"response_delay_max": <最大响应延迟分钟>, "response_delay_max": <max response delay in minutes>,
"sentiment_bias": <-1.01.0>, "sentiment_bias": <-1.0 to 1.0>,
"stance": "<supportive/opposing/neutral/observer>", "stance": "<supportive/opposing/neutral/observer>",
"influence_weight": <影响力权重> "influence_weight": <influence weight>
}}, }},
... ...
] ]