Site export
The process of extracting a website's content, structure, or data from a platform into a portable format that can be moved or archived elsewhere.
Also known as: website export, content export, data export
A site export is the process of extracting a website’s content, structure, or related data from the platform it lives on into a portable format that can be saved, archived, or moved to another platform. How complete an export is (what does and does not come along) varies widely between platforms.
Site exports are central to migrations, archiving, backups, and reducing platform lock-in.
What can typically be exported
Different categories of website data can be exported with varying levels of completeness:
| Data type | Typically exportable | Notes |
|---|---|---|
| Page text content | Usually yes | Often as HTML, Markdown, or plain text |
| Blog post content | Usually yes | Standard formats include WordPress XML (WXR) and RSS |
| Images and media | Usually yes | Sometimes by URL reference, sometimes as a media archive |
| Page layouts | Often partially or not at all | Depends on whether layouts use proprietary blocks |
| Navigation and menus | Often platform-specific format | May need manual recreation |
| Forms and form submissions | Rarely transferable | Typically need rebuild on new platform |
| Member accounts and passwords | Rarely transferable | Passwords are stored as hashes that may not carry over; users may need to reset passwords or re-register |
| Ecommerce orders and customers | Sometimes via platform-specific export | Custom mappings often needed |
| Site search index | Not transferable | Rebuilt by the new platform |
| URL redirects | Sometimes exportable as a list | Need to be reapplied on the new platform |
| Theme / design files | Sometimes | More likely on self-hosted platforms; rarely on hosted ones |
| Custom code and scripts | Sometimes | Embedded JavaScript may transfer; deeper customizations often do not |
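As an illustration of the redirect row above, the following is a minimal sketch of loading an exported redirect list and flattening chains before reapplying it on a new platform. The CSV column names `source` and `target`, the sample paths, and both function names are assumptions made for this example; real platforms name and format their redirect exports differently.

```python
import csv
import io

def load_redirects(csv_text):
    """Parse a redirect list into a {source_path: target_path} mapping."""
    reader = csv.DictReader(io.StringIO(csv_text))
    redirects = {}
    for row in reader:
        source = row["source"].strip()   # hypothetical column name
        target = row["target"].strip()   # hypothetical column name
        # Ignore blank rows and redirects that point at themselves.
        if source and target and source != target:
            redirects[source] = target
    return redirects

def resolve(path, redirects, max_hops=10):
    """Follow redirect chains so an old URL maps to its final destination."""
    seen = set()
    # Stop on loops (path already seen) or after max_hops hops.
    while path in redirects and path not in seen and len(seen) < max_hops:
        seen.add(path)
        path = redirects[path]
    return path

# Invented sample data for illustration only.
SAMPLE = """source,target
/old-about,/about
/about-us,/old-about
/blog/2019/hello,/blog/hello
"""

redirects = load_redirects(SAMPLE)
flattened = {src: resolve(src, redirects) for src in redirects}
```

Flattening chains first means each old URL needs only one hop on the new platform, which is usually easier to verify than a list of redirects that point at other redirects.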
Export by platform
| Platform | Export capabilities |
|---|---|
| WordPress (self-hosted) | Strong: WordPress eXtended RSS (WXR), full database SQL dumps, theme files, media library |
| WordPress.com | WXR XML export of content; theme not included |
| Squarespace | Limited XML export of blog posts and basic pages; layouts and design not included |
| Wix | Very limited; no full structured export of content |
| Webflow | HTML/CSS export of static design; CMS data export as CSV; backend functionality not included |
| Shopify | CSV export of products, customers, orders; theme files via theme editor |
| Ghost | JSON export of content (posts, pages, tags, settings); members exported separately as CSV |
| Notion (as CMS) | Markdown or HTML export per page or workspace |
| Static sites with Markdown content | Source files in Git are inherently portable |
The differences in export quality reflect different architectural approaches and business models.
Common export formats
| Format | Use |
|---|---|
| WordPress eXtended RSS (WXR) | XML-based; standard for WordPress content interchange |
| Markdown | Plain-text content with syntax for formatting; common in static sites and modern editors |
| HTML | Direct page markup; preserves visual structure but mixes content with presentation |
| JSON | Structured data; common in headless CMS and modern platforms |
| CSV | Tabular data; common for products, customers, redirects |
| SQL dump | Complete database export; comprehensive but platform-specific |
| ZIP archive | Bundle of files including HTML, CSS, JS, images |
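To make the WXR row concrete, here is a small sketch of reading published posts out of a WXR file with Python's standard library. The sample document is invented and heavily trimmed, the function name `published_posts` is hypothetical, and the namespace URIs assume a WXR 1.2 file; other WXR versions use a different version number in the `wp` namespace.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in WXR 1.2 files (an assumption; check the
# xmlns attributes at the top of the actual export file).
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wp": "http://wordpress.org/export/1.2/",
}

# Invented, trimmed-down sample. Real exports carry many more elements.
SAMPLE_WXR = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello world</title>
      <link>https://example.com/hello-world/</link>
      <content:encoded><![CDATA[<p>First post.</p>]]></content:encoded>
      <wp:post_type>post</wp:post_type>
      <wp:status>publish</wp:status>
    </item>
    <item>
      <title>Draft idea</title>
      <link>https://example.com/?p=7</link>
      <content:encoded><![CDATA[<p>Not ready.</p>]]></content:encoded>
      <wp:post_type>post</wp:post_type>
      <wp:status>draft</wp:status>
    </item>
  </channel>
</rss>
"""

def published_posts(wxr_text):
    """Return title, URL, and HTML body for each published post."""
    root = ET.fromstring(wxr_text.encode("utf-8"))
    posts = []
    for item in root.iterfind("channel/item"):
        if item.findtext("wp:post_type", namespaces=NS) != "post":
            continue
        if item.findtext("wp:status", namespaces=NS) != "publish":
            continue
        posts.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "html": item.findtext("content:encoded", default="", namespaces=NS),
        })
    return posts

posts = published_posts(SAMPLE_WXR)
```

Because WXR is RSS with extra namespaced elements, a generic XML parser is enough to read it; the platform-specific part is knowing which `wp:` fields to filter on.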
What “good” exports include
A high-quality export typically:
- Includes all text content with formatting preserved
- References or includes media files
- Captures structured data fields (categories, tags, custom fields)
- Preserves URLs (or includes a URL map)
- Uses standard formats other tools can read
- Documents the schema clearly
A poor-quality export typically:
- Includes only basic text without structure
- Loses or breaks media references
- Drops custom fields and metadata
- Omits URL information
- Uses proprietary formats with no documentation
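The two checklists above can be turned into a rough automated audit. The sketch below assumes each exported item has already been loaded into a dictionary; the field names (`title`, `url`, `body`, `categories`, `tags`, `custom_fields`, `media`) and the rule for what counts as a resolvable media reference are assumptions for illustration, not a standard schema.

```python
# Hypothetical field names; adjust to whatever the real export contains.
REQUIRED_FIELDS = ("title", "url", "body")
STRUCTURE_FIELDS = ("categories", "tags", "custom_fields")

def audit_record(record):
    """Return a list of problems found in one exported item."""
    problems = []
    # Text content and URL information should always be present.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing {field}")
    # At least one kind of structured metadata should survive the export.
    if not any(record.get(field) for field in STRUCTURE_FIELDS):
        problems.append("no structured metadata")
    # Media should be referenced by absolute URL or site-rooted path.
    for media_url in record.get("media", []):
        if not media_url.startswith(("http://", "https://", "/")):
            problems.append(f"unresolvable media reference: {media_url}")
    return problems

good = {
    "title": "About",
    "url": "/about",
    "body": "<p>Hi</p>",
    "tags": ["company"],
    "media": ["/media/team.jpg"],
}
poor = {"title": "About", "body": "Hi", "media": ["team.jpg"]}

good_problems = audit_record(good)
poor_problems = audit_record(poor)
```

Running a check like this over every item in a trial export gives an early estimate of how much manual cleanup a migration will need.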
Limitations of exports
Even comprehensive exports usually do not include:
- Visual layouts built in proprietary editors. A drag-and-drop page exists only within its platform’s renderer
- Hosted features. Forms, search, member areas, and ecommerce flows are wired into the platform’s infrastructure
- Real-time integrations. Connections to Mailchimp, Stripe, and other third-party services need to be reestablished
- Performance optimizations. Caching configurations, image optimization, and CDN setup must be recreated
- Analytics history. Analytics built into a platform usually stays with that platform; third-party tools (like Google Analytics) keep their history if the tracking code is reinstalled on the new site
This gap between “exportable content” and “movable site” is what makes platform migration work-intensive.
Workflows around exports
Common ways exports are used:
- Backup. Regular exports stored offline or in version control, in case of platform issues
- Archiving. Preserving a site after it is decommissioned
- Migration to a new platform. Combined with a custom import process
- Format conversion. Converting WordPress content to Markdown for a static site, for example
- Data analysis. Extracting content for review, audit, or reporting
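For the backup workflow in particular, one possible approach is to store each export as a timestamped snapshot and skip the write when nothing has changed. The sketch below assumes the export is already available as bytes; the function name `store_export`, the `export-<timestamp>-<hash>.bin` naming scheme, and the 12-character hash prefix are all choices made for this example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

def store_export(data, backup_dir, now=None):
    """Save one export snapshot; return its path, or None if unchanged."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # A short SHA-256 prefix identifies the content of this snapshot.
    digest = hashlib.sha256(data).hexdigest()[:12]
    # Skip the write when an identical snapshot is already stored.
    if any(backup_dir.glob(f"export-*-{digest}.bin")):
        return None
    now = now or datetime.now(timezone.utc)
    path = backup_dir / f"export-{now:%Y%m%dT%H%M%SZ}-{digest}.bin"
    path.write_bytes(data)
    return path
```

Calling `store_export(export_bytes, "backups/")` on a schedule would keep one file per distinct export. Comparing hashes between runs also shows when an export has changed even though the site's content has not, for example after a plugin or platform update.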
Tools for exports and conversions
- Platform-native exporters built into the source platform’s admin
- Crawlers and scrapers (Screaming Frog, custom scripts) for sites without good native exports
- Format converters (Pandoc, html-to-markdown libraries, custom scripts)
- Migration plugins and services that combine export, transformation, and import
- Headless CMS migration tools for moving between Sanity, Contentful, Strapi, etc.
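To show what a format converter does at its simplest, here is a toy HTML-to-Markdown converter built on Python's standard `html.parser` module. It handles only headings (`h1` to `h3`), paragraphs, `strong`, `em`, and links, and the class and function names are invented for this sketch. It is an illustration of the idea, not a substitute for Pandoc or a maintained conversion library.

```python
from html.parser import HTMLParser

class MiniMarkdown(HTMLParser):
    """Convert a small subset of HTML (h1-h3, p, strong, em, a) to Markdown."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []   # output fragments, joined at the end
        self.href = None  # target of the link currently being read

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag == "strong":
            self.parts.append("**")
        elif tag == "em":
            self.parts.append("*")
        elif tag == "a":
            self.href = dict(attrs).get("href", "")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            self.parts.append("\n\n")  # blank line between blocks
        elif tag == "strong":
            self.parts.append("**")
        elif tag == "em":
            self.parts.append("*")
        elif tag == "a":
            self.parts.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        self.parts.append(data)

def html_to_markdown(html):
    """Return Markdown for the supported subset; other tags are dropped."""
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()

markdown = html_to_markdown(
    '<h2>Pricing</h2><p>See <a href="/plans">our <strong>plans</strong></a>.</p>'
)
```

Tags outside the supported subset are dropped while their text is kept, which mirrors a common trade-off in real conversions: content survives, but platform-specific markup does not.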
Common misconceptions
- “Export means everything comes along.” Most exports preserve content but lose layout, integrations, and platform-specific features.
- “Exports are deterministic.” The same source can produce different exports depending on the export option chosen, plugins installed, and platform version.
- “You only need exports when migrating.” Regular exports also serve as backups against platform outages, account issues, or unexpected changes.
- “All platforms export to standard formats.” Many use proprietary formats that require platform-specific tools to read.