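Q1: About "Tagged PDF" files
A tagged PDF is a PDF file that contains metadata describing the structure of the document and the order of its elements. This accessibility markup, supplied behind the scenes, governs the reading order and the logical structure used to present the document content [1].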
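Q2: About "tags"
PDF tags are the key to accessing the content of a PDF document with assistive technologies such as screen readers. The tags arrange the PDF content in a hierarchy, known as the tag tree [1].
The tags themselves are invisible, but they carry important information about the content of the document. A tagged PDF contains many different types of tags; the most common are text, alternate text (for images), headings, links, and link descriptions [2].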
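To make the idea of a tag tree more concrete, here is a minimal sketch in plain C#. It does not use any PDF library: TagNode and TagTreeSketch are hypothetical names invented for this illustration, and the sample content is made up. It only shows how a hierarchy of tags, walked depth-first, yields the order in which a screen reader could announce the content, with a figure contributing its alternate text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one node in a tag tree (not a real PDF library type).
public sealed class TagNode
{
    public string Type { get; }   // e.g. "Document", "H1", "P", "Figure"
    public string Text { get; }   // visible text, empty if none
    public string Alt { get; }    // alternate text, empty if none
    public List<TagNode> Children { get; } = new List<TagNode>();

    public TagNode(string type, string text = "", string alt = "")
    {
        Type = type;
        Text = text;
        Alt = alt;
    }

    // Appends a child and returns this node so calls can be chained.
    public TagNode Add(TagNode child)
    {
        Children.Add(child);
        return this;
    }
}

public static class TagTreeSketch
{
    // Depth-first walk, children in document order: the logical reading order.
    public static IEnumerable<string> ReadingOrder(TagNode node)
    {
        // Prefer the visible text; fall back to the alternate text.
        string spoken = node.Text.Length > 0 ? node.Text : node.Alt;
        if (spoken.Length > 0)
        {
            yield return node.Type + ": " + spoken;
        }
        foreach (TagNode child in node.Children)
        {
            foreach (string line in ReadingOrder(child))
            {
                yield return line;
            }
        }
    }

    // Builds a small made-up tree: a heading, a paragraph, and a figure.
    public static TagNode BuildSample()
    {
        return new TagNode("Document")
            .Add(new TagNode("H1", text: "What is a tagged PDF?"))
            .Add(new TagNode("P", text: "Tags describe the structure of the page."))
            .Add(new TagNode("Figure", alt: "Company logo"));
    }

    public static void Main()
    {
        foreach (string line in ReadingOrder(BuildSample()))
        {
            Console.WriteLine(line);
        }
    }
}
```

Running the sketch prints the heading, the paragraph, and then the figure's alternate text, one per line. The "Document" node carries no text of its own, so it is skipped.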
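Q3: What PDF tags are for and why they matter
Adding PDF tags does not change the visual appearance of the document. Instead, it provides an invisible layer that formats the document to work with screen readers. This makes it easier to extract text and graphics from the PDF file and helps screen readers present the content in the correct order [2].
PDF tags can also be used to deliver content to devices with smaller screens, such as smartphones and tablets [2].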
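Q4: How to create a tagged PDF file
The method described here creates a tagged PDF file from back-end C# code. Install and reference the PDF API Spire.PDF for .NET through NuGet, then call the classes and methods it provides to mark content, structure elements, and so on.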
C#

    using Spire.Pdf;
    using Spire.Pdf.Graphics;
    using Spire.Pdf.Interchange.TaggedPdf;
    using System.Drawing;

    namespace CreateTaggedPDF
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create a PdfDocument object
                PdfDocument pdf = new PdfDocument();

                // Add a page
                pdf.Pages.Add(PdfPageSize.A4);

                // Set the tab order to follow the document structure
                pdf.Pages[0].SetTabOrder(TabOrder.Structure);

                // Create a PdfTaggedContent object and set the document language and title
                PdfTaggedContent taggedContent = new PdfTaggedContent(pdf);
                taggedContent.SetLanguage("en-US");
                taggedContent.SetTitle("test");

                // Create the font, brush, and string format
                PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Times New Roman", 10), true);
                PdfSolidBrush brush = new PdfSolidBrush(Color.Black);
                PdfStringFormat format = new PdfStringFormat(PdfTextAlignment.Left);

                // Add the root "Document" element, then a paragraph containing a span
                PdfStructureElement article = taggedContent.StructureTreeRoot.AppendChildElement(PdfStandardStructTypes.Document);
                PdfStructureElement paragraph1 = article.AppendChildElement(PdfStandardStructTypes.Paragraph);
                PdfStructureElement span1 = paragraph1.AppendChildElement(PdfStandardStructTypes.Span);
                span1.BeginMarkedContent(pdf.Pages[0]);
                // Draw the content on the page between Begin/EndMarkedContent
                pdf.Pages[0].Canvas.DrawString("A PDF tag is the key to accessing the contents of PDF documents with supporting technologies such as screen readers. ", font, brush, new Rectangle(40, 0, 480, 80), format);
                span1.EndMarkedContent(pdf.Pages[0]);

                // Add a second paragraph, marked directly
                PdfStructureElement paragraph2 = article.AppendChildElement(PdfStandardStructTypes.Paragraph);
                paragraph2.BeginMarkedContent(pdf.Pages[0]);
                pdf.Pages[0].Canvas.DrawString("A PDF tag arranges the PDF content in a hierarchical architecture or tag tree.", font, brush, new Rectangle(40, 80, 480, 80), format);
                paragraph2.EndMarkedContent(pdf.Pages[0]);

                // Add a figure element with alternate text and draw an image into it
                PdfStructureElement figure1 = article.AppendChildElement(PdfStandardStructTypes.Figure);
                figure1.Alt = "replacement text1";
                figure1.BeginMarkedContent(pdf.Pages[0], null);
                PdfImage image = PdfImage.FromFile(@"logo.png");
                pdf.Pages[0].Canvas.DrawImage(image, new PointF(40, 200), new SizeF(100, 100));
                figure1.EndMarkedContent(pdf.Pages[0]);

                // Add a second figure element with alternate text and draw a rectangle into it
                PdfStructureElement figure2 = article.AppendChildElement(PdfStandardStructTypes.Figure);
                figure2.Alt = "replacement text2";
                figure2.BeginMarkedContent(pdf.Pages[0], null);
                pdf.Pages[0].Canvas.DrawRectangle(PdfPens.Black, new Rectangle(300, 200, 100, 100));
                figure2.EndMarkedContent(pdf.Pages[0]);

                // Save the document
                pdf.SaveToFile("CreateTaggedFile_result.pdf");
            }
        }
    }
VB.NET

    Imports Spire.Pdf
    Imports Spire.Pdf.Graphics
    Imports Spire.Pdf.Interchange.TaggedPdf
    Imports System.Drawing

    Namespace CreateTaggedPDF
        Class Program
            Private Shared Sub Main(args As String())
                ' Create a PdfDocument object
                Dim pdf As New PdfDocument()

                ' Add a page
                pdf.Pages.Add(PdfPageSize.A4)

                ' Set the tab order to follow the document structure
                pdf.Pages(0).SetTabOrder(TabOrder.[Structure])

                ' Create a PdfTaggedContent object and set the document language and title
                Dim taggedContent As New PdfTaggedContent(pdf)
                taggedContent.SetLanguage("en-US")
                taggedContent.SetTitle("test")

                ' Create the font, brush, and string format
                Dim font As New PdfTrueTypeFont(New Font("Times New Roman", 10), True)
                Dim brush As New PdfSolidBrush(Color.Black)
                Dim format As New PdfStringFormat(PdfTextAlignment.Left)

                ' Add the root "Document" element, then a paragraph containing a span
                Dim article As PdfStructureElement = taggedContent.StructureTreeRoot.AppendChildElement(PdfStandardStructTypes.Document)
                Dim paragraph1 As PdfStructureElement = article.AppendChildElement(PdfStandardStructTypes.Paragraph)
                Dim span1 As PdfStructureElement = paragraph1.AppendChildElement(PdfStandardStructTypes.Span)
                span1.BeginMarkedContent(pdf.Pages(0))
                ' Draw the content on the page between Begin/EndMarkedContent
                pdf.Pages(0).Canvas.DrawString("A PDF tag is the key to accessing the contents of PDF documents with supporting technologies such as screen readers. ", font, brush, New Rectangle(40, 0, 480, 80), format)
                span1.EndMarkedContent(pdf.Pages(0))

                ' Add a second paragraph, marked directly
                Dim paragraph2 As PdfStructureElement = article.AppendChildElement(PdfStandardStructTypes.Paragraph)
                paragraph2.BeginMarkedContent(pdf.Pages(0))
                pdf.Pages(0).Canvas.DrawString("A PDF tag arranges the PDF content in a hierarchical architecture or tag tree.", font, brush, New Rectangle(40, 80, 480, 80), format)
                paragraph2.EndMarkedContent(pdf.Pages(0))

                ' Add a figure element with alternate text and draw an image into it
                Dim figure1 As PdfStructureElement = article.AppendChildElement(PdfStandardStructTypes.Figure)
                figure1.Alt = "replacement text1"
                figure1.BeginMarkedContent(pdf.Pages(0), Nothing)
                Dim image As PdfImage = PdfImage.FromFile("logo.png")
                pdf.Pages(0).Canvas.DrawImage(image, New PointF(40, 200), New SizeF(100, 100))
                figure1.EndMarkedContent(pdf.Pages(0))

                ' Add a second figure element with alternate text and draw a rectangle into it
                Dim figure2 As PdfStructureElement = article.AppendChildElement(PdfStandardStructTypes.Figure)
                figure2.Alt = "replacement text2"
                figure2.BeginMarkedContent(pdf.Pages(0), Nothing)
                pdf.Pages(0).Canvas.DrawRectangle(PdfPens.Black, New Rectangle(300, 200, 100, 100))
                figure2.EndMarkedContent(pdf.Pages(0))

                ' Save the document
                pdf.SaveToFile("CreateTaggedFile_result.pdf")

                ' Open the result in the default PDF viewer
                System.Diagnostics.Process.Start("CreateTaggedFile_result.pdf")
            End Sub
        End Class
    End Namespace
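Headings are listed above among the most common tag types, but the sample tags only paragraphs, a span, and figures. The following sketch adds a level-1 heading using the same calls as the C# example. It assumes that the installed Spire.PDF version exposes PdfStandardStructTypes.HeadingLevel1; that member does not appear in the code above, so check it against your version of the library. The class name, heading text, and output file name are made up for illustration.

```csharp
using Spire.Pdf;
using Spire.Pdf.Graphics;
using Spire.Pdf.Interchange.TaggedPdf;
using System.Drawing;

namespace CreateTaggedPDF
{
    class HeadingSketch
    {
        static void Main(string[] args)
        {
            // Same setup as the example above: document, page, tab order, tagged content
            PdfDocument pdf = new PdfDocument();
            pdf.Pages.Add(PdfPageSize.A4);
            pdf.Pages[0].SetTabOrder(TabOrder.Structure);

            PdfTaggedContent taggedContent = new PdfTaggedContent(pdf);
            taggedContent.SetLanguage("en-US");
            taggedContent.SetTitle("test");

            // A larger font for the heading, the original size for body text
            PdfTrueTypeFont headingFont = new PdfTrueTypeFont(new Font("Times New Roman", 14), true);
            PdfTrueTypeFont bodyFont = new PdfTrueTypeFont(new Font("Times New Roman", 10), true);
            PdfSolidBrush brush = new PdfSolidBrush(Color.Black);
            PdfStringFormat format = new PdfStringFormat(PdfTextAlignment.Left);

            // Root "Document" element
            PdfStructureElement document = taggedContent.StructureTreeRoot.AppendChildElement(PdfStandardStructTypes.Document);

            // Assumption: HeadingLevel1 is available in the installed Spire.PDF version
            PdfStructureElement heading = document.AppendChildElement(PdfStandardStructTypes.HeadingLevel1);
            heading.BeginMarkedContent(pdf.Pages[0]);
            pdf.Pages[0].Canvas.DrawString("What Is a Tagged PDF?", headingFont, brush, new Rectangle(40, 0, 480, 30), format);
            heading.EndMarkedContent(pdf.Pages[0]);

            // A paragraph that follows the heading in the tag tree
            PdfStructureElement paragraph = document.AppendChildElement(PdfStandardStructTypes.Paragraph);
            paragraph.BeginMarkedContent(pdf.Pages[0]);
            pdf.Pages[0].Canvas.DrawString("A PDF tag arranges the PDF content in a hierarchical architecture or tag tree.", bodyFont, brush, new Rectangle(40, 40, 480, 80), format);
            paragraph.EndMarkedContent(pdf.Pages[0]);

            // Hypothetical output file name
            pdf.SaveToFile("TaggedHeading_result.pdf");
        }
    }
}
```

Because the heading element is appended before the paragraph, it precedes the paragraph in the tag tree, which is the order in which assistive technology is expected to read them.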
References:
[1] https://247accessibledocuments.com/what-is-a-tagged-pdf/
[2] https://accessibility-i.org/what-is-a-tagged-pdf/