fix(seo): fix Rich Results, trim title/desc, add JSON-LD to index pages
Add description, creator, license to isPartOf nested Dataset on
type/topic pages to pass Google Rich Results validation. Trim
homepage title to 56 chars and description to 150 chars. Add
CollectionPage JSON-LD to /types and /topics index pages.
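
For illustration, a minimal Python sketch of the JSON-LD shape this message describes: a page node whose isPartOf holds a nested Dataset carrying description, creator, and license. The description text is the one shown in the diff. The creator name, license URL, dataset name, page name, and helper function names are placeholders assumed for the example, not values from this commit, and putting CollectionPage and isPartOf in one node is done here only for brevity.

```python
import json

# Description text copied from the JSON-LD line in this commit's diff.
DATASET_DESCRIPTION = (
    "The largest open corpus of classified Word documents. "
    "736K+ real .docx files from the public web, classified into "
    "10 document types and 9 topics across 46+ languages."
)


def nested_dataset(creator_name: str, license_url: str) -> dict:
    """Build the Dataset node that sits under isPartOf.

    creator_name and license_url are hypothetical inputs; the commit
    does not show the real values.
    """
    return {
        "@type": "Dataset",
        "name": "docx-corpus",  # assumed from the page title
        "url": "https://docxcorp.us",
        "description": DATASET_DESCRIPTION,
        "creator": {"@type": "Organization", "name": creator_name},
        "license": license_url,
    }


def page_jsonld(page_type: str, name: str, url: str, dataset: dict) -> dict:
    """Wrap a page node around the nested Dataset."""
    return {
        "@context": "https://schema.org",
        "@type": page_type,
        "name": name,
        "url": url,
        "isPartOf": dataset,
    }


if __name__ == "__main__":
    # Placeholder creator and license, for illustration only.
    dataset = nested_dataset("EXAMPLE CREATOR", "https://example.org/license")
    page = page_jsonld(
        "CollectionPage", "Document types", "https://docxcorp.us/types", dataset
    )
    print(json.dumps(page, indent=2, ensure_ascii=False))
```

The printed object would go inside a `<script type="application/ld+json">` element on the page.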
- <title>docx-corpus — The largest open corpus of classified Word documents</title>
+ <title>docx-corpus — Open corpus of classified Word documents</title>
  <meta name="description"
-   content="736K+ real .docx files from the public web, classified into 10 document types and 9 topics across 46+ languages. The missing dataset for document processing research.">
+   content="736K+ real .docx files from the public web, classified into 10 types and 9 topics across 46+ languages. Open dataset for document processing research.">
  <link rel="canonical" href="https://docxcorp.us">
  <meta property="og:title" content="docx-corpus — Every Word document on the public web. Classified and open.">
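
A rough way to check the trimmed strings against the targets named in the commit message (56 for the title, 150 for the description) is sketched below. The helper names are made up for this example. Python's `len` counts code points, and the dash in the title takes three bytes in UTF-8, so character and byte counts can differ; the sketch reports both.

```python
# Strings copied from the diff above.
OLD_TITLE = "docx-corpus — The largest open corpus of classified Word documents"
NEW_TITLE = "docx-corpus — Open corpus of classified Word documents"
NEW_DESCRIPTION = (
    "736K+ real .docx files from the public web, classified into 10 types "
    "and 9 topics across 46+ languages. Open dataset for document "
    "processing research."
)

# Targets taken from the commit message.
TITLE_LIMIT = 56
DESCRIPTION_LIMIT = 150


def lengths(text: str) -> tuple[int, int]:
    """Return (character count, UTF-8 byte count) for text."""
    return len(text), len(text.encode("utf-8"))


def fits(text: str, limit: int) -> bool:
    """True if the character count of text is within limit."""
    return len(text) <= limit


if __name__ == "__main__":
    for label, text, limit in [
        ("old title", OLD_TITLE, TITLE_LIMIT),
        ("new title", NEW_TITLE, TITLE_LIMIT),
        ("new description", NEW_DESCRIPTION, DESCRIPTION_LIMIT),
    ]:
        chars, nbytes = lengths(text)
        status = "ok" if fits(text, limit) else "too long"
        print(f"{label}: {chars} chars, {nbytes} bytes, limit {limit}: {status}")
```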
"description": "The largest open corpus of classified Word documents. 736K+ real .docx files from the public web, classified into 10 document types and 9 topics across 46+ languages.",