What Is Rich Text? How It Works in a Headless CMS

Rich text is any text that carries formatting beyond plain characters—bold, italics, headings, links, images, tables, and more. The term shows up in two contexts that are easy to conflate: Rich Text Format (RTF), a specific file type Microsoft created in 1987, and the broader idea of rich text as styled, structured content inside a content management system (CMS).

Both matter. The first shaped how we think about portable formatted documents. The second shapes how modern content teams create, store, and deliver content across channels. This article explains what Rich Text Format is, how it works, and where it falls short, as well as how the concept has evolved in headless CMS platforms like dotCMS.

What Is Rich Text Format?

Rich Text Format (RTF) is a proprietary document file format, with a published specification, developed by Microsoft for cross-platform document interchange. First released in 1987, RTF was designed to solve a specific problem: letting users move formatted documents between different word processors and operating systems without losing styling. Today that covers applications on Windows, macOS, and Linux.

RTF files (extension: .rtf) encode formatting as plain-text control words rather than binary data. When you open an .rtf file in any compatible word processor, the control words tell the application how to render bold, italics, fonts, colors, paragraph alignment, tables, and embedded images. This plain-text backbone makes RTF more inspectable and less prone to corruption than binary formats like the older .doc.
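
To make the control-word idea concrete, here is a minimal sketch in Python that assembles a one-paragraph RTF document by hand. The function names (escape_rtf, to_rtf) and the run structure are illustrative assumptions, not part of any library, and the sketch assumes ASCII input only.

```python
from typing import Optional


def escape_rtf(text: str) -> str:
    """Escape the three characters RTF reserves: backslash and both braces."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def to_rtf(runs: list[tuple[str, Optional[str]]]) -> str:
    """Build a one-paragraph RTF document from (text, style) runs.

    style is None, "bold", or "italic". Non-ASCII characters would need
    additional escaping that this sketch does not attempt.
    """
    control_words = {"bold": "\\b", "italic": "\\i"}
    body = []
    for text, style in runs:
        escaped = escape_rtf(text)
        if style is None:
            body.append(escaped)
        else:
            # A {...} group scopes the control word to this run only.
            body.append("{" + control_words[style] + " " + escaped + "}")
    # Header: RTF version, character set, and a one-entry font table.
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Helvetica;}}\\f0 "
    return header + "".join(body) + "\\par}"


sample = to_rtf([
    ("Plain, ", None),
    ("bold", "bold"),
    (", and ", None),
    ("italic", "italic"),
    (".", None),
])
print(sample)
# {\rtf1\ansi\deff0{\fonttbl{\f0 Helvetica;}}\f0 Plain, {\b bold}, and {\i italic}.\par}
```

Saved with an .rtf extension, the printed string should open as a single formatted paragraph. Because the whole file is readable text, you can see exactly which control word produces which effect.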

Microsoft published the final version of the RTF specification—version 1.9.1—in 2008 and has not updated it since. Features introduced in Word 2010 and later do not save correctly to RTF. The format is effectively frozen.

What Is the Difference Between Rich Text and Plain Text?

Plain text is raw characters (letters, numbers, symbols) with no formatting instructions. A .txt file is plain text. It is universally readable, small in file size, and completely unstyled. Every character is treated equally.

Rich text adds formatting and structure on top of those characters: bold, italics, headings, lists, links, images, colors, and tables. Where plain text treats a heading and a paragraph identically, rich text distinguishes between them—giving content visual hierarchy and semantic meaning.

In a CMS, this distinction plays out practically. Plain text fields work for slugs, labels, meta descriptions, and other short-form data. Rich text fields hold the formatted content that users actually read—blog posts, product descriptions, help articles, policy documents.

Where Is Rich Text Format Used?

Rich text, both the .rtf file format and the broader concept of styled text, appears across several environments:

Word Processors

Word processors like Microsoft Word, LibreOffice, and Google Docs have historically supported RTF as an exchange format. Users save or export documents as .rtf to preserve styling when sharing files across applications and operating systems. Common use cases include sharing formatted documents by email, archiving simple formatted text, and avoiding vendor-specific formats when the recipient’s software is unknown.

That said, RTF’s role in word processing has shrunk. DOCX and cloud-native formats handle collaboration, comments, advanced layouts, and version history—none of which RTF supports. Most teams use RTF only when compatibility with very old systems is a requirement.

Content Management Systems

In most CMS platforms, “rich text” refers to the editing experience—a WYSIWYG (“What You See Is What You Get”) or block editor—rather than the .rtf file format itself. Most CMSs store rich text as HTML or a structured format like JSON, then render it on web pages.

Editors use rich text fields to add headings, links, lists, and basic styling without writing code. The tradeoff: HTML-based rich text can introduce inconsistent spacing, inline styles, and formatting quirks that behave unpredictably across devices and channels.

Headless CMS

A headless CMS decouples content storage from presentation. Content is stored in a backend, managed through an editorial interface, and delivered via APIs to separate front ends such as websites, mobile apps, kiosks, digital signage, and AI agents.

Rich text in a headless CMS typically gets stored as structured JSON or HTML that front ends parse and render independently. By treating content as data with predictable, restricted formatting, headless CMS rich text can be delivered consistently across channels without manual reformatting for each destination.
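
As a rough illustration of "content as data," the sketch below renders one stored document for two channels. The node shape (a list of blocks with type, text, and level keys) is a hypothetical simplification, not the output of any specific CMS.

```python
from html import escape

# Hypothetical node shape: every block has "type" and "text"; headings add "level".
document = [
    {"type": "heading", "level": 2, "text": "Store hours"},
    {"type": "paragraph", "text": "Open 9-5 & closed on holidays."},
]


def render_html(blocks):
    """Web front end: map each block type to an HTML element."""
    parts = []
    for block in blocks:
        text = escape(block["text"])
        if block["type"] == "heading":
            parts.append(f"<h{block['level']}>{text}</h{block['level']}>")
        elif block["type"] == "paragraph":
            parts.append(f"<p>{text}</p>")
        else:
            raise ValueError(f"no HTML renderer for block type {block['type']!r}")
    return "\n".join(parts)


def render_plain(blocks):
    """Text-only channel: headings become upper case, paragraphs pass through."""
    lines = []
    for block in blocks:
        if block["type"] == "heading":
            lines.append(block["text"].upper())
        elif block["type"] == "paragraph":
            lines.append(block["text"])
        else:
            raise ValueError(f"no plain-text renderer for block type {block['type']!r}")
    return "\n\n".join(lines)


print(render_html(document))
print(render_plain(document))
```

The stored content never changes. Each channel decides how a heading or paragraph should look, which is why restricting the set of block types keeps every renderer small.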

This requires collaboration between editors and developers. Editors use content models to define where rich text is allowed and what formatting is permitted. Developers build front-end components that know how to render each element. The CMS enforces the rules between them.
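
One way to picture the enforcement step is a validation pass that runs before content is saved. The sketch below assumes a made-up content model (the field names, block types, and marks are all hypothetical) and is not how any particular CMS implements its rules.

```python
# Hypothetical content model: per field, which block types and marks are permitted.
CONTENT_MODEL = {
    "blog_post.body": {
        "blocks": {"paragraph", "heading", "list"},
        "marks": {"bold", "italic", "link"},
    },
    "product.summary": {
        "blocks": {"paragraph"},
        "marks": {"bold"},
    },
}


def validate(field, blocks):
    """Return a list of violations; an empty list means the content passes."""
    rules = CONTENT_MODEL[field]
    problems = []
    for index, block in enumerate(blocks):
        if block["type"] not in rules["blocks"]:
            problems.append(
                f"block {index}: type '{block['type']}' is not allowed in {field}"
            )
        for mark in block.get("marks", []):
            if mark not in rules["marks"]:
                problems.append(
                    f"block {index}: mark '{mark}' is not allowed in {field}"
                )
    return problems


draft = [
    {"type": "heading", "text": "Specs"},
    {"type": "paragraph", "text": "Lightweight frame", "marks": ["bold", "italic"]},
]

print(validate("product.summary", draft))  # two violations
print(validate("blog_post.body", draft))   # []
```

The same draft passes in one field and fails in another, which is the point: the rules live in the model, so editors and developers do not have to negotiate them document by document.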

How RTF, HTML, and Structured Rich Text Compare

The differences matter when choosing how to store and deliver content. RTF stores formatting inline with minimal structure. HTML adds semantic markup. Structured rich text in a CMS separates content from presentation entirely, making it reusable across channels and governable at the field level.

Aspect	RTF (Rich Text Format)	HTML	Structured Rich Text (in CMS)
Primary purpose	Document interchange between word processors	Marking up documents for web browsers	Representing content plus structure and semantics inside a content model
Structure	Low: inline formatting, limited semantic structure	Medium: headings, lists, tables, sections	High: blocks, fields, and entities (paragraphs, callouts, references, embeds)
Semantic richness	Minimal: bold/italic/underline, fonts, colors	Moderate: heading levels, lists, links, semantic tags	High: explicit fields for titles, summaries, FAQs, entities, relationships, metadata
Reusability across channels	Low: designed for documents, not omnichannel reuse	Medium: repurposable but often coupled to page layout	High: content elements reused in web, apps, kiosks, AI agents, and feeds
Content vs presentation	Tightly coupled: formatting and content mixed	Mixed: structure plus presentational markup	Separated: content stored cleanly; presentation handled by templates or front-end layers
Governance	File-based, outside CMS governance	Governed at page or template level	Governed at content-type and field level — easier approvals and compliance checks
Fit for AI and GEO (generative engine optimization)	Weak: ambiguous structure, hard to map to user intents	Adequate: parseable, but page-centric	Strongest: designed for chunking, retrieval, FAQ/Q&A, and entity-based experiences
Typical CMS use	Legacy WYSIWYG fields, pasted-in formatted text	Standard web pages and blogs	Modern content models with rich text fields plus structured subfields and references

Key Features of Rich Text Format

The RTF file format supports several formatting capabilities that made it useful for document interchange:

Text styling: Bold, italics, underlining, font sizes, font families, colors, paragraph alignment (left, center, right, justified), and indentation.
Structural elements: Bulleted and numbered lists, headings, paragraphs with different styles, and basic table support in most implementations.
Embedded content: Images (WMF, PNG, JPEG) and OLE objects, depending on RTF version and editor. Support for advanced interactivity or complex layouts is limited compared to DOCX or HTML.
Cross-platform compatibility: Most word processors can read and write RTF, making it a useful common denominator for document exchange. Fidelity can vary between RTF versions and editor implementations.

Benefits and Limitations of Rich Text Format

Where RTF works well: Portability is its strongest suit. RTF files open reliably across word processors and operating systems, making them a safe choice when the recipient’s software is unknown. The format is simpler than DOCX or PDF but richer than plain text—useful for simple reports, letters, and documentation. Because RTF uses plain-text control words under the hood, files are more inspectable than binary formats, which helps in migration and integration workflows. And because the format has been supported for decades, archived .rtf files remain openable long after the tools that created them have been retired.

Where RTF falls short: The specification has been frozen since 2008 (version 1.9.1). It lacks advanced layout, collaboration, commenting, and metadata features that DOCX, PDF, and modern HTML-based workflows provide. Rendering can be inconsistent across editors—the same .rtf file may look slightly different in Word, LibreOffice, and TextEdit. And critically for content operations, RTF has no concept of structured or semantic content. It’s a document format, not a content format.

For simple formatting and cross-platform document sharing, RTF still works. For anything involving collaboration, omnichannel delivery, governance, or structured content models, which is the territory of modern CMS platforms, RTF is the wrong tool.

How Does Rich Text Work in a Headless CMS?

The shift from RTF-as-document to rich-text-as-structured-data is the key conceptual move. In a headless CMS, rich text content is stored as structured data that an API delivers to any front end that requests it.

To work with rich text in a headless CMS, content teams typically need to:

Model rich text fields carefully. Use a WYSIWYG or block editor, but limit the available formatting to what every consuming channel actually needs. If your content goes to web, mobile, and email, don’t allow formatting that only works on the web.
Coordinate with front-end teams. Headings, lists, links, and embeds need to render consistently on every client. Front-end developers build rendering components for each block type; the CMS delivers the structured data.
Convert legacy content. If you have existing .rtf or HTML documents, convert them to the CMS’s structured format (HTML or JSON) as part of your content migration pipeline. Don’t paste raw RTF into a headless CMS—you’ll lose the benefits of structured storage.
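
The conversion step can be sketched with Python's standard html.parser module. This is a minimal example under stated assumptions: it handles only a small tag subset, the output block shape is hypothetical, and a real migration would also need lists, links, images, and tables.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"p": "paragraph", "h1": "heading", "h2": "heading", "h3": "heading"}
MARK_TAGS = {"b": "bold", "strong": "bold", "i": "italic", "em": "italic"}


class LegacyHtmlConverter(HTMLParser):
    """Convert a small HTML subset into typed blocks (hypothetical shape)."""

    def __init__(self):
        super().__init__()
        self.blocks = []        # all blocks seen so far
        self.current = None     # block currently being filled, if any
        self.active_marks = []  # marks open at the current position

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.current = {"type": BLOCK_TAGS[tag], "content": []}
            if BLOCK_TAGS[tag] == "heading":
                self.current["level"] = int(tag[1])
            self.blocks.append(self.current)
        elif tag in MARK_TAGS:
            self.active_marks.append(MARK_TAGS[tag])
        # Other tags (span, font, div) and all attributes, including
        # inline styles, are ignored; their text content is kept.

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.current = None
        elif tag in MARK_TAGS and MARK_TAGS[tag] in self.active_marks:
            self.active_marks.remove(MARK_TAGS[tag])

    def handle_data(self, data):
        if self.current is None:
            return  # text outside a known block is dropped in this sketch
        run = {"text": data}
        if self.active_marks:
            run["marks"] = list(self.active_marks)
        self.current["content"].append(run)


def convert(html):
    parser = LegacyHtmlConverter()
    parser.feed(html)
    parser.close()
    return parser.blocks


legacy = (
    '<h2 style="color:red">Returns</h2>'
    '<p>Items ship <b>free</b> within <span style="font-size:9px">30 days</span>.</p>'
)
for block in convert(legacy):
    print(block)
```

Note what the converter discards: the inline color and font size never reach the structured output, so the migrated content carries meaning (a heading, a bold run) rather than the old page's styling.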

The storage format matters more than most teams realize. A CMS that stores rich text as an HTML blob gives you formatting but couples content to presentation. A CMS that stores rich text as structured JSON, where each paragraph, heading, image, and embed is a discrete, typed node, gives you formatting plus portability, governance, and clean API output.
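
A small example of what typed nodes buy you: questions about the content become simple filters instead of HTML parsing. The block shape below is hypothetical, and the two helper functions are illustrative only.

```python
# Hypothetical typed nodes, as a structured rich text field might store them.
blocks = [
    {"type": "heading", "level": 2, "text": "Shipping"},
    {"type": "paragraph", "text": "Orders leave the warehouse within two days."},
    {"type": "heading", "level": 3, "text": "International"},
    {"type": "image", "src": "/img/map.png", "alt": "Delivery zones"},
    {"type": "image", "src": "/img/box.png", "alt": ""},
    {"type": "heading", "level": 2, "text": "Returns"},
]


def outline(blocks, max_level=2):
    """Collect heading text down to max_level, e.g. for a table of contents."""
    return [
        b["text"]
        for b in blocks
        if b["type"] == "heading" and b["level"] <= max_level
    ]


def images_missing_alt(blocks):
    """Governance check: list image sources that have no alt text."""
    return [b["src"] for b in blocks if b["type"] == "image" and not b.get("alt")]


print(outline(blocks))             # ['Shipping', 'Returns']
print(outline(blocks, 3))          # ['Shipping', 'International', 'Returns']
print(images_missing_alt(blocks))  # ['/img/box.png']
```

With an HTML blob, both checks would require parsing markup first and hoping editors used heading and image tags consistently.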

What to Look for in a CMS Rich Text Editor

Not all rich text implementations are equal. When evaluating how a CMS handles rich text, these specifics matter:

Storage format. HTML, Markdown, or structured JSON? HTML is common but the least portable of the three. JSON-based storage (dotCMS’s Block Editor, Contentful’s Rich Text, Sanity’s Portable Text, Storyblok’s block system) provides cleaner content-presentation separation.
Editor experience. Does the editor feel familiar to content contributors? The best rich text editors let editors format content without producing unpredictable output.
Formatting controls. Can administrators restrict available formatting? In compliance-led organizations, limiting what editors can do—no custom fonts, no inline styles, only approved heading levels—is essential.
Embeddable content. Can editors insert structured content types (product cards, CTAs, code blocks) within the rich text flow, or are they limited to text and images?
API output. A structured JSON response is easier for front-end teams to consume than an HTML blob that needs parsing and sanitization.

How dotCMS Handles Rich Text

dotCMS is a visual headless CMS built for compliance-led organizations. It gives content teams three field types for text content, each designed for a different use case:

Block Editor: Built on Tiptap, the Block Editor stores content as structured JSON. Each paragraph, heading, image, and embed is a discrete block that editors can drag, drop, and reorder. For front-end teams, this means clean, predictable API output that maps directly to rendering components. The Block Editor supports embedded content types (contentlets), AI-powered content generation via dotAI, and configurable formatting restrictions, so administrators control exactly which block types editors can use.
WYSIWYG Editor: Stores content as HTML. Familiar to editors who prefer a word-processor-style experience. Works well for use cases where HTML output is acceptable and content will primarily render on the web.
TextArea: Stores plain text. Useful for fields that shouldn’t carry formatting: slugs, meta descriptions, and short labels.
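
To show what "maps directly to rendering components" can look like for Block Editor content, here is a sketch that walks a document in Tiptap's general JSON shape (a doc node containing typed nodes, with text nodes carrying marks). The exact payload a dotCMS API returns is an assumption here, and the component table is hypothetical; check the real response before relying on these node names.

```python
from html import escape

# Sample document in Tiptap-style JSON. The specific content is made up.
doc = {
    "type": "doc",
    "content": [
        {
            "type": "heading",
            "attrs": {"level": 2},
            "content": [{"type": "text", "text": "Release notes"}],
        },
        {
            "type": "paragraph",
            "content": [
                {"type": "text", "text": "Export is "},
                {"type": "text", "text": "faster", "marks": [{"type": "bold"}]},
                {"type": "text", "text": " in this build."},
            ],
        },
    ],
}

MARK_TAGS = {"bold": "strong", "italic": "em"}


def render_text(node):
    """Text node: escape the text, then wrap it once per recognized mark."""
    html = escape(node["text"])
    for mark in node.get("marks", []):
        tag = MARK_TAGS.get(mark["type"])
        if tag:
            html = f"<{tag}>{html}</{tag}>"
    return html


def render_children(node):
    return "".join(render(child) for child in node.get("content", []))


def render_heading(node):
    level = node["attrs"]["level"]
    return f"<h{level}>{render_children(node)}</h{level}>"


def render_paragraph(node):
    return f"<p>{render_children(node)}</p>"


# One "component" per node type; adding a block type means adding one entry.
COMPONENTS = {
    "doc": render_children,
    "heading": render_heading,
    "paragraph": render_paragraph,
    "text": render_text,
}


def render(node):
    component = COMPONENTS.get(node["type"])
    # Unknown node types render as nothing in this sketch.
    return component(node) if component else ""


print(render(doc))
# <h2>Release notes</h2><p>Export is <strong>faster</strong> in this build.</p>
```

A front end written in another language would follow the same pattern: a lookup from node type to component, with no HTML parsing or sanitizing of stored markup.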

The Block Editor is where dotCMS’s approach diverges from most CMS platforms. Because content is stored as structured JSON rather than HTML, it’s natively suited for omnichannel delivery—the same content can render on a website, a mobile app, a kiosk, or feed an AI agent without transformation. Combined with the Universal Visual Editor, editors get a true visual editing experience on top of headless architecture—they see how content looks on the actual front end while editing. That’s what “headless without the drawbacks” means in practice.

Governance is built into the workflow. Administrators can restrict available block types per content type, enforce multi-step approval workflows before content publishes, and maintain a full audit trail—version history, permissions, and compliance checks at every step of the content lifecycle.

Get Started with Rich Text in dotCMS

Rich Text Format served its purpose as a document interchange standard for two decades. But for modern content operations—omnichannel delivery, structured content models, governance, compliance—the concept of rich text has moved far beyond what .rtf was built to do. dotCMS’s Block Editor and WYSIWYG fields give content teams the rich text capabilities they need with the structured storage, visual editing, and compliance workflows that enterprise organizations require. Request a demo to see how it works.

Rich Text Format Frequently Asked Questions

Is RTF the same as rich text in a CMS?

No. RTF (Rich Text Format) is a specific file type (.rtf) Microsoft created in 1987 for exchanging formatted documents between word processors. “Rich text” in a CMS refers to styled, structured content stored as HTML or JSON—not the .rtf file format. The terms share a name but serve different purposes.

What is the difference between rich text and plain text?

Plain text contains only raw characters with no formatting. Rich text adds styling (bold, italics, headings), structure (lists, tables), and embedded content (images, links). In a CMS, plain text fields handle metadata and labels; rich text fields hold the formatted content that users read.

How is rich text stored in a headless CMS?

Most headless CMS platforms store rich text as either HTML or structured JSON. HTML storage is simpler but mixes content with presentation. JSON-based storage (used by dotCMS’s Block Editor, Contentful, Sanity, and others) represents each content element as a typed node, making it easier to render consistently across channels and front-end frameworks.

What is a rich text editor?

A rich text editor is the interface content creators use to compose and format content within a CMS. It typically provides toolbar controls for bold, headings, lists, and media embedding. In headless CMS platforms, rich text editors range from traditional WYSIWYG interfaces to block-based editors that produce structured JSON output.