Agent Skills
Content Proofreading
AIPOCH
An academic proofreading skill for Chinese/English manuscripts, triggered when you need automated checks for spelling, grammar, terminology consistency, and formatting before submission.
2
0
FILES
88100Total Score
View Evaluation ReportCore Capability
79 / 100
Functional Suitability
10 / 12
Reliability
9 / 12
Performance & Context
8 / 8
Agent Usability
12 / 16
Human Usability
7 / 8
Security
9 / 12
Maintainability
9 / 12
Agent-Specific
15 / 20
Medical Task
20 / 20 Passed
98You are preparing an academic paper for journal/conference submission and need a final language + formatting pass
4/4
94You have bilingual (Chinese/English) content and want consistent punctuation, wording, and style across both languages
4/4
92English checks
4/4
92Spelling (including US/UK variants)
4/4
92End-to-end case for English checks
4/4
SKILL.md
When to Use
- You are preparing an academic paper for journal/conference submission and need a final language + formatting pass.
- You have bilingual (Chinese/English) content and want consistent punctuation, wording, and style across both languages.
- Your manuscript contains domain terminology (e.g., life sciences) and you need consistent Chinese–English term mapping and abbreviation rules.
- You need to validate references, numbers/units, and heading levels against a required style (APA/MLA/GB/T 7714).
- You want a shareable report (HTML or Markdown annotations) with precise error locations and revision suggestions.
Key Features
-
English checks
- Spelling (including US/UK variants)
- Grammar (agreement, tense, articles, clause structure)
- Punctuation conventions (US/UK)
- Style suggestions (redundancy detection, passive voice optimization)
-
Chinese checks
- Typo/misused character detection (dictionary-based)
- Grammar and collocation checks
- Chinese vs. English punctuation normalization
- Academic expression optimization suggestions
-
Terminology consistency
- Domain terminology database (life sciences by default)
- Bidirectional Chinese–English correspondence checks
- Abbreviation rules (require full form on first occurrence)
- Synonym unification to preferred standard terms
-
Formatting checks
- Reference style validation (APA/MLA/GB/T 7714, etc.)
- Number and unit normalization
- Heading level consistency
- Abbreviation consistency across the document
-
Reporting
- HTML interactive report or Markdown annotations
- Precise error localization
- Actionable revision suggestions
Dependencies
-
Python:
>= 3.8 -
Python packages (install via
pip install -r requirements.txt)languagetool-python(version: seerequirements.txt) — English grammar checkingopencc(version: seerequirements.txt) — Traditional/Simplified Chinese conversionjieba(version: seerequirements.txt) — Chinese tokenizationpyenchant(version: seerequirements.txt) — spelling checksmarkdown(version: seerequirements.txt) — Markdown renderingpython-docx(version: seerequirements.txt) —.docxreadingdocx2pdf(version: seerequirements.txt) — Word-to-PDF conversion
Example Usage
1) Install
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
2) Run (basic)
python scripts/init_run.py --input <paper_file_path> --output <output_path>
3) Run (advanced)
python scripts/init_run.py \
--input paper.md \
--output report.html \
--lang en \
--style apa \
--terminology biology \
--format html
4) CLI parameters
| Parameter | Description | Default |
|---|---|---|
--input | Input file path | Required |
--output | Output report path | Generates an HTML report by default |
--lang | Language to check (en / zh / both) | both |
--style | Reference style (apa / mla / gb) | apa |
--terminology | Domain terminology set | biology |
--format | Output format (html / markdown) | html |
--no-pdf | Skip PDF generation during Word→PDF conversion | false |
5) Use as a Python module (end-to-end)
from scripts.english_checker import EnglishChecker
from scripts.chinese_checker import ChineseChecker
from scripts.terminology_manager import TerminologyManager
from scripts.annotation_generator import AnnotationGenerator
text = """
Messenger RNA (mRNA) is transcribed in the nucleus.
"""
en_checker = EnglishChecker()
zh_checker = ChineseChecker()
term_manager = TerminologyManager(domain="biology")
results = []
results.extend(en_checker.check(text))
results.extend(zh_checker.check(text))
results.extend(term_manager.check(text))
generator = AnnotationGenerator(output_format="html")
report = generator.generate(results)
with open("report.html", "w", encoding="utf-8") as f:
f.write(report)
Implementation Details
Architecture / Core Modules
-
english_checker.py- Core engine for English spelling/grammar/style checks.
- Designed to be rule-extensible (add or register new rule sets).
-
chinese_checker.py- Core engine for Chinese typo/grammar/style checks.
- Includes a library of common academic writing error patterns.
-
terminology_manager.py- Terminology database management (import/export/query/update).
- Performs term consistency checks, bilingual mapping validation, and abbreviation policy checks.
-
annotation_generator.py- Converts detected issues into a visual report (HTML) or annotated Markdown.
- Ensures issues include location, type, and suggested fix.
-
word_converter.py- Extracts text from
.docx. - Optionally converts Word to PDF (can be disabled via
--no-pdf).
- Extracts text from
Terminology database format (JSON)
Organized by domain; each entry can include bilingual forms and abbreviation metadata:
{
"biology": {
"cell": {
"en": "cell",
"abbrev": null,
"full_form": null
},
"mrna": {
"en": "mRNA",
"abbrev": "mRNA",
"full_form": "messenger RNA"
}
}
}
Checking logic (typical):
- If an abbreviation (e.g.,
mRNA) appears, verify the full form appears at first mention (e.g.,messenger RNA (mRNA)). - If both Chinese and English terms appear, verify they match the configured mapping for the selected domain.
- If synonyms are detected, prefer the standardized term defined in the database.
Rule database format (JSON)
Rules are grouped by language and category:
{
"english": {
"spelling": [],
"grammar": [],
"style": []
},
"format": {
"references": [],
"numbers": [],
"units": []
}
}
How rules are applied (high level):
- Load rule sets by
--langand--style. - Run language-specific checks (English/Chinese) and formatting checks.
- Merge results into a unified issue list.
- Render issues into the selected output format (
html/markdown) with location-aware annotations.
Extensibility
-
Add new rules
- Create a rule file under
assets/rules/. - Implement rules following the project’s rule template.
- Register the rule set in the rule index.
- Run tests to validate precision/recall and avoid false positives.
- Create a rule file under
-
Add new terminology sets
- Create a terminology JSON under
assets/terminology/. - Follow the domain structure shown above.
- Register the new domain in the terminology index so it can be selected via
--terminology.
- Create a terminology JSON under