Detecting Superscript and Subscript in Word Documents with C#: A Complete Guide

In C#, you can use the Microsoft.Office.Interop.Word library to work with Word documents. To identify superscript and subscript text in a document, follow these steps:

Open the Word document and get a Document object:

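using Microsoft.Office.Interop.Word;

// Start Word and open the file; C# strings need double quotes,
// and the @ prefix keeps the backslash in the path literal.
Application wordApp = new Application();
Document doc = wordApp.Documents.Open(@"C:\test.docx");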

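Walk through every story in the document and search for superscript and subscript text. The search matches on font formatting rather than on a text pattern, and wraps each match in a marker: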

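foreach (Range range in doc.StoryRanges)
{
    Find find = range.Find;
    find.Text = "";                      // empty text: match on formatting alone
    find.Forward = true;
    find.Wrap = WdFindWrap.wdFindStop;
    find.Format = true;
    find.MatchCase = false;
    find.MatchWholeWord = false;

    // Search for superscript; ^& in the replacement stands for the matched text
    find.ClearFormatting();
    find.Replacement.ClearFormatting();
    find.Font.Superscript = 1;
    find.Replacement.Text = "[sup]^&[/sup]";
    find.Execute(Replace: WdReplace.wdReplaceAll);

    // Search for subscript; reset the criteria so superscript no longer applies
    find.ClearFormatting();
    find.Font.Subscript = 1;
    find.Replacement.Text = "[sub]^&[/sub]";
    find.Execute(Replace: WdReplace.wdReplaceAll);
}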
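If the goal is only to locate superscript and subscript text without modifying the document, a read-only scan of character formatting is one alternative. The following is a minimal sketch, assuming doc is an already opened Document; the ScriptScanner class and its Scan method are hypothetical names introduced here for illustration. It reads Font.Superscript and Font.Subscript on each character of the main text story and merges neighbouring characters that share the same formatting.

```csharp
using System.Collections.Generic;
using Microsoft.Office.Interop.Word;

// Hypothetical helper for illustration; not part of the Interop library.
public static class ScriptScanner
{
    // Returns runs of text tagged "sup", "sub", or "normal", in document order.
    // Only the main text story (doc.Content) is scanned.
    public static List<(string Text, string Kind)> Scan(Document doc)
    {
        var runs = new List<(string Text, string Kind)>();
        foreach (Range ch in doc.Content.Characters)
        {
            // Font.Superscript and Font.Subscript are ints; 0 means the format is off.
            string kind = ch.Font.Superscript != 0 ? "sup"
                        : ch.Font.Subscript != 0 ? "sub"
                        : "normal";

            int last = runs.Count - 1;
            if (last >= 0 && runs[last].Kind == kind)
            {
                // Same formatting as the previous character: extend that run.
                runs[last] = (runs[last].Text + ch.Text, kind);
            }
            else
            {
                runs.Add((ch.Text, kind));
            }
        }
        return runs;
    }
}
```

Each character costs several COM calls, so this approach is likely to be slow on long documents; the Find-based search is usually the better fit there.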

Process the detected superscript and subscript text as needed, for example by converting the markers to HTML tags:

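foreach (Range range in doc.StoryRanges)
{
    Find find = range.Find;
    find.ClearFormatting();
    find.Replacement.ClearFormatting();

    // Use Find/Replace instead of assigning range.Text: overwriting the
    // text of a whole story would discard its formatting.
    find.Execute(FindText: "[sup]", ReplaceWith: "<sup>", Replace: WdReplace.wdReplaceAll);
    find.Execute(FindText: "[/sup]", ReplaceWith: "</sup>", Replace: WdReplace.wdReplaceAll);

    find.Execute(FindText: "[sub]", ReplaceWith: "<sub>", Replace: WdReplace.wdReplaceAll);
    find.Execute(FindText: "[/sub]", ReplaceWith: "</sub>", Replace: WdReplace.wdReplaceAll);
}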
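When the HTML is needed as a string rather than written back into the document, the conversion can be done entirely in memory. The sketch below assumes the text has already been split into runs tagged "sup", "sub", or "normal"; that run shape, the HtmlScripts class, and the ToHtml method are all hypothetical names chosen for illustration. It also HTML-encodes the text so that characters such as < in the document do not get mistaken for tags.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical helper for illustration; works on plain strings, no Word required.
public static class HtmlScripts
{
    // Builds an HTML fragment from (text, kind) runs, where kind is
    // "sup", "sub", or anything else for ordinary text.
    public static string ToHtml(IEnumerable<(string Text, string Kind)> runs)
    {
        var sb = new StringBuilder();
        foreach (var (text, kind) in runs)
        {
            // Escape &, <, > and quotes so document text cannot break the markup.
            string escaped = WebUtility.HtmlEncode(text);
            if (kind == "sup" || kind == "sub")
            {
                sb.Append('<').Append(kind).Append('>')
                  .Append(escaped)
                  .Append("</").Append(kind).Append('>');
            }
            else
            {
                sb.Append(escaped);
            }
        }
        return sb.ToString();
    }
}
```

For example, passing the runs ("H", "normal"), ("2", "sub"), ("O", "normal") yields H<sub>2</sub>O.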

Save and close the Word document, then quit Word:

doc.Save();
doc.Close();
// Quit Word as well; otherwise a hidden WINWORD.EXE process keeps running.
wordApp.Quit();
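If an exception is thrown between opening and closing the document, the Word process can be left running in the background. One way to guard against that is a try/finally block, sketched below under the assumption that the whole job fits in one method; the WordSession class, the Process method, and the path parameter are hypothetical names used only for illustration.

```csharp
using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Word;

// Hypothetical wrapper for illustration.
public static class WordSession
{
    public static void Process(string path)
    {
        Application wordApp = new Application();
        Document doc = null;
        try
        {
            doc = wordApp.Documents.Open(path);
            // ... search and replace as needed ...
            doc.Save();
        }
        finally
        {
            if (doc != null)
            {
                // Changes were already saved above (or are abandoned on failure).
                doc.Close(SaveChanges: WdSaveOptions.wdDoNotSaveChanges);
                Marshal.ReleaseComObject(doc);
            }
            // Always shut Word down, even after an exception.
            wordApp.Quit();
            Marshal.ReleaseComObject(wordApp);
        }
    }
}
```

Releasing the COM references explicitly is a common precaution; whether it is strictly required depends on how the rest of the application holds on to Word objects.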

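This is a simple example, and a real implementation may need adjusting to your specific requirements. Note also that the Office.Interop library requires Microsoft Office to be installed and does not run on non-Windows platforms.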