Skip to content

36002 workflow fire convert block editor markdown to prosemirror json on save server side#36253

Open
hassandotcms wants to merge 6 commits into
mainfrom
36002-workflow-fire-convert-block-editor-htmlmarkdown-to-prosemirror-json-on-save-server-side
Open

36002 workflow fire convert block editor markdown to prosemirror json on save server side#36253
hassandotcms wants to merge 6 commits into
mainfrom
36002-workflow-fire-convert-block-editor-htmlmarkdown-to-prosemirror-json-on-save-server-side

Conversation

@hassandotcms

Copy link
Copy Markdown
Member

What

Converts Story Block (Block Editor) field values supplied as Markdown to Tiptap/ProseMirror JSON server-side, on the shared content save path. Non-interactive clients (AI agents, headless imports) no longer need a human to open and re-save the contentlet for the field to read back as structured content.

Closes #36002 (Markdown scope; see Scope below).

How

  • TiptapMarkdown.isTiptapDoc / isMarkdownRepresentable — pure discriminators.
  • MapToContentletPopulator.fillFields — the seam shared by the workflow fire endpoints and the content REST API. For a Story Block value:
    • starts with { (JSON) or < (HTML) → stored unchanged;
    • otherwise (Markdown) → converted via TiptapMarkdown.toTiptap.
  • Guards: already-valid JSON passes through untouched; a Markdown update is rejected (400) when the existing stored doc contains rich blocks Markdown can't represent (dotContent, dotVideo, grid, …) instead of silently destroying them; a conversion failure never blocks the save (stores raw + logs).
  • OpenAPI fire note corrected to reflect automatic server-side conversion (regenerated).

Scope

  • Markdown only. HTML is deferred to a follow-up PR. HTML continues to pass through unchanged (no regression).

Behavior change

  • A Markdown overwrite of rich content now returns 400 instead of a silent success that
    corrupted the field. Additive and rollback-safe (stored JSON is read natively by N-1).

Testing

  • Unit (65): TiptapMarkdownDocDetectionTest (13) + existing TiptapMarkdownTest /
    RoundTripContractTest.
  • Integration (5): StoryBlockMarkdownPopulatorTest — convert + GraphQL read-back, JSON
    passthrough, HTML passthrough, primitive replace, rich-overwrite reject.
  • Regression re-run clean: MapToContentletPopulatorTest (20), StoryBlockValidationTest (28).

…minators

Add two pure helpers to TiptapMarkdown that the save path needs to safely
ingest Story Block values:

- isTiptapDoc(String): cheap detector for an already-valid Tiptap/ProseMirror
  document (peeks the first non-whitespace char before parsing), so editor-
  authored JSON can be stored unchanged instead of re-parsed as Markdown.
- isMarkdownRepresentable(String): true only when every block is Markdown-
  expressible, used to refuse a Markdown overwrite that would silently drop
  rich blocks (dotContent, dotVideo, grid, etc.). Marks are ignored on purpose
  (losing a mark loses styling, not content).

Covered by TiptapMarkdownDocDetectionTest (13 cases incl. nested rich blocks,
marks-only docs, malformed/empty/null input).

Refs #36002
…t save path

Wire the converter into MapToContentletPopulator.fillFields, the shared seam
that the workflow fire endpoints and the content REST API all funnel through.
For a Story Block field whose incoming value is Markdown (begins with neither
'{' nor '<'), convert it to a ProseMirror JSON document and store that, so non-
interactive clients (AI agents, headless imports) no longer require a human to
open and re-save the contentlet.

Guards:
- Already-valid Tiptap JSON and (deferred) HTML are stored unchanged.
- A Markdown update is refused when the existing stored document contains rich
  blocks Markdown cannot represent, rather than silently destroying them.
- A conversion failure never blocks the save: the raw value is stored and a
  warning logged (graceful degradation, consistent with #35728).

The converter stays pure; conversion and guards live at the ingestion seam.

Covered by StoryBlockMarkdownPopulatorTest (convert + GraphQL read-back, JSON
passthrough, HTML passthrough, primitive replace, rich-overwrite reject);
registered in MainSuite1b.

Refs #36002
…wn conversion

The fire endpoints' Block Editor note promised Markdown/HTML acceptance but
admitted it only took effect after a human re-saved in the editor — documenting
the exact bug #36002 fixes. Update the shared @operation note to state that
Markdown is converted to ProseMirror JSON automatically on save (and already-
valid JSON is stored unchanged), drop the "converted when opened in the editor"
caveat, and use a Markdown example. Regenerate openapi.yaml (all 6 fire
operations share the constant).

Refs #36002
@claude

claude Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Claude finished @hassandotcms's task in 1m 53s —— View job


Rollback Safety Analysis

  • Read rollback-unsafe categories reference
  • Get full PR diff
  • Analyze diff against all unsafe categories
  • Applied label AI: Not Safe To Rollback and posted analysis comment

Result: 🟡 MEDIUM — Not Safe To Rollback

Category matched: M-3 — REST / GraphQL / Headless API Contract Change

Why: The save path for Story Block (Block Editor) fields now auto-converts Markdown input to ProseMirror JSON on write (MapToContentletPopulator.java lines 258–313). After N deploys, headless/AI-agent clients that submit Markdown and read back the field expect a structured ProseMirror JSON object. Rolling back to N-1 reverts to the old pass-through behavior, and those same clients receive a plain Markdown string — breaking their expectation. The Postman and Karate test updates confirm the response shape changes from a Markdown string to a ProseMirror JSON object.

No unsafe categories found in: database migrations, ES mapping changes, content JSON model version, DROP TABLE/COLUMN, runonce tasks, or any CRITICAL / HIGH category.

Safer alternative: The ProseMirror JSON written by N is natively readable by N-1, so no data-layer two-phase migration is needed. The recommendation is to document the Markdown auto-conversion as rollback-incompatible behavior in the release notes, advising consumers that this feature requires version N or above.

@github-actions

github-actions Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

🤖 Bedrock Review — deepseek.v3.2

[🟠 High] dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:261 — The condition value instanceof String may miss cases where the value is a non-String object that could be valid for a Story Block field (e.g., a JSON object). This could cause unexpected behavior or data loss if a client sends a structured JSON object instead of a JSON string.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:293 — The check trimmed.isEmpty() || trimmed.charAt(0) == '{' || trimmed.charAt(0) == '<' assumes HTML starts with '<', but a valid HTML string could have leading whitespace or be a different case (e.g., <!DOCTYPE>). This may cause HTML to be incorrectly stored as-is without conversion when it should be.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:298 — The method contentlet.getStringProperty(field.getVelocityVarName()) may return null if the property is not set, which could lead to a NullPointerException when passed to TiptapMarkdown.isTiptapDoc(existing). The method isTiptapDoc handles null, but the call !TiptapMarkdown.isMarkdownRepresentable(existing) could still throw if existing is null? Actually, isMarkdownRepresentable also handles null, so this is safe. However, the logic depends on the existing value being a Tiptap doc; if it's null, the condition TiptapMarkdown.isTiptapDoc(existing) will be false, skipping the rich content check. This might allow Markdown to overwrite a null field even if the field previously had rich content? No, because the existing value is null. This is acceptable, but the behavior should be clear.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:121 — The method isTiptapDoc uses value.stripLeading() which is available in Java 11+. dotCMS uses Java 25, so it's fine, but ensure compatibility with the project's minimum Java version if different.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:130 — The JSON parsing with MAPPER.readTree(value) could be expensive for large documents. Since this is called on every save for Story Block fields, consider performance impact. However, the early check trimmed.charAt(0) != '{' mitigates this for non-JSON values.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:152 — The set MARKDOWN_NODE_TYPES includes "dotImage" which is a custom block; ensure that dotImage is indeed representable in Markdown (likely as an image syntax). If not, this could incorrectly allow overwrites that lose information.

[🟠 High] dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:168 — The method isMarkdownRepresentable catches IOException and returns true for malformed JSON. This means a malformed JSON string will be considered representable, potentially allowing a Markdown overwrite that should be rejected. This could lead to data loss. The behavior should be consistent with isTiptapDoc which returns false for malformed JSON.

[🟡 Medium] dotcms-integration/src/test/java/com/dotcms/rest/StoryBlockMarkdownPopulatorTest.java:113 — The test uses Mockito.mock but does not verify any interactions. This is fine for integration, but ensure mocks are properly configured to avoid false positives.

[🟡 Medium] dotcms-integration/src/test/java/com/dotcms/rest/StoryBlockMarkdownPopulatorTest.java:176 — The test expects an IllegalArgumentException with a message containing "rich content". If the exception message changes, the test could fail. Consider matching the exception type only or using a custom matcher.

[🟡 Medium] dotcms-postman/src/main/resources/postman/BringBack.postman_collection.json:115 — The Postman test updates the expectation for the body field from a plain string to a structured JSON object. This is correct for the new behavior, but ensure all related Postman collections are updated accordingly to avoid breaking CI/CD tests.

[🟡 Medium] dotcms-postman/src/main/resources/postman/JsScriptAPI.postman_collection.json:25 — Similar update for body field expectations. Ensure consistency across all API test suites.

[🟡 Medium] test-karate/src/test/java/tests/defaults/CheckingJSONAttributes.feature:71 — The Karate test now expects body to be a structured object. This is correct, but verify that all Karate scenarios relying on the old string format are updated.

[🟠 High] dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:314 — The Logger.warn call logs the exception message e.getMessage() which could contain sensitive information or implementation details. Consider logging a generic message or using a sanitized exception.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/rest/api/v1/workflow/WorkflowResource.java:231 — The updated documentation states that Markdown is the only format converted, but the code also handles HTML (by storing as-is). The documentation should reflect that HTML is still accepted but not converted, to avoid confusion.

[🟡 Medium] dotCMS/src/main/webapp/WEB-INF/openapi/openapi.yaml:18675 — The OpenAPI specification updates the description for Block Editor fields. Ensure that the change is consistent across all endpoints (it appears to be updated in multiple places, which is good).


Run: #28038347219 · tokens: in: 11064 · out: 1323 · total: 12387

@hassandotcms hassandotcms requested review from fmontes and wezell June 22, 2026 14:39
@hassandotcms hassandotcms marked this pull request as ready for review June 22, 2026 14:40
#36002 normalizes a plain-text/Markdown Story Block value to a ProseMirror JSON
document on save, so a webPageContent `body` sent as plain text now reads back as
a structured doc (object), not the raw string. Update the two API tests that
asserted the old raw-string round-trip to assert the normalized doc instead,
keeping plain-text input so they still exercise the server-side conversion:

- Karate CheckingJSONAttributes.feature: assert body.type == 'doc' and the
  paragraph text, instead of body == "<raw string>".
- Postman JsScriptAPI: assert body.type == 'doc' and that the surviving text
  segments are present (inline <b> markup is dropped by the Markdown converter),
  across fireNew/fireEdit/firePublish via the JS workflows viewtool.

Refs #36002
…ntent checks

CI surfaced two more API tests that asserted a webPageContent `body` (a Story
Block field) read back as the raw plain-text string. #36002 now normalizes
plain-text to a ProseMirror doc on save, so the field comes back as structured
JSON. Update only the content (webPageContent) assertions to read the text from
the doc — body.content[0].content[0].text — leaving the template `body`
assertions (template markup, not a Story Block field) untouched:

- BringBack: 3 content version checks (create/edit/bring-back).
- VersionableResource: 1 content working-version check.

Determined the complete affected set by parsing every collection's body
assertions against its request endpoint (content vs template), so template,
GraphQL seeded/bundle, and response-body assertions are correctly excluded.

Refs #36002
…lmarkdown-to-prosemirror-json-on-save-server-side
@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

🤖 Bedrock Review — qwen.qwen3-next-80b-a3b

New Issues

  • 🔴 Critical: dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:275processStoryBlockField calls toStoryBlockJson which may throw IllegalArgumentException if the existing content contains rich blocks, but this exception is not caught or logged. This will cause the entire contentlet save to fail with a 500 error, even though the spec requires "conversion failure never blocks the save". The contract explicitly permits Markdown conversion failure to be silent — but this path is now a hard failure.
  • 🔴 Critical: dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:127isTiptapDoc calls MAPPER.readTree(value) without validating that value is not null or empty after stripLeading(). If value is a string of only whitespace (e.g., " "), trimmed.isEmpty() returns true and returns false — correct. But if value is null, stripLeading() returns null, then trimmed.charAt(0) throws NullPointerException. This is reachable via processStoryBlockField when contentlet.getStringProperty(...) returns null (common for new contentlets). The method must guard against null before charAt(0).
  • 🔴 Critical: dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:159isMarkdownRepresentable(String) catches IOException and returns true, but isMarkdownRepresentable(JsonNode) does not handle null input. If MAPPER.readTree() returns null (e.g., malformed JSON), isMarkdownRepresentable(null) is called, which returns true — correct. But if node.path("type") returns a non-string (e.g., number), .asText("") is safe. However, node.path("content") may return a non-array (e.g., null, string, number), and isArray() returns false, so the loop is skipped — correct. But if node is null, node.path("type") returns a missing node, .asText("") returns "", and !type.isEmpty() is false — correct. However, the method assumes node is a valid JSON object — but if readTree() returns null due to malformed JSON, isMarkdownRepresentable(null) is called and returns true. This is acceptable per spec. But the real bug is in isTiptapDoc: it does not guard against null before charAt(0), and processStoryBlockField does not catch IllegalArgumentException — both are critical.
  • 🟠 High: dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:284Logger.warn(...) uses this as the logger category, but MapToContentletPopulator is not annotated with @Logger and does not declare a Logger field. This will cause Logger.warn(null, ...) to be called, which logs to null category — effectively silent. Must use Logger.getLogger(MapToContentletPopulator.class) or declare private static final Logger log = Logger.getLogger(MapToContentletPopulator.class);.
  • 🟠 High: dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:275toStoryBlockJson calls TiptapMarkdown.toTiptap(value) which may throw Exception — but the catch block logs the message and returns the original value. However, if TiptapMarkdown.toTiptap throws an unchecked exception (e.g., NullPointerException from malformed input), it is caught and logged — correct. But if TiptapMarkdown.isTiptapDoc(existing) throws IOException (e.g., malformed existing JSON), it returns false, so the guard is skipped — this is acceptable per spec. But the real issue is the uncaught IllegalArgumentException in processStoryBlockField and the NullPointerException in isTiptapDoc.

Existing

  • 🟡 Medium: dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:284Logger.warn(this, ...) uses this as logger category — this was already broken in pre-existing code and remains broken. The fix should have used a static logger, but it didn't.

Resolved

  • dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:118isTiptapDoc now guards against null and empty strings with stripLeading() and isEmpty() check — fixed the prior NullPointerException risk.
  • dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:159isMarkdownRepresentable(String) now catches IOException and returns true — fixed prior risk of unhandled parse failure.
  • dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:165isMarkdownRepresentable(JsonNode) now handles null input — fixed prior risk.
  • dotCMS/src/main/webapp/WEB-INF/openapi/openapi.yaml — updated to reflect Markdown-only guidance — consistent with new behavior.
  • dotcms-integration/src/test/java/com/dotcms/rest/StoryBlockMarkdownPopulatorTest.java — comprehensive integration tests added for all critical paths — closes test gap.
  • dotCMS/src/test/java/com/dotcms/tiptap/TiptapMarkdownDocDetectionTest.java — unit tests added for isTiptapDoc and isMarkdownRepresentable — closes test gap.

Run: #28113820145 · tokens: in: 11031 · out: 1484 · total: 12515

@claude

claude Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Pull Request Unsafe to Rollback!!!

  • Category: M-3 — REST / GraphQL / Headless API Contract Change
  • Risk Level: 🟡 MEDIUM
  • Why it's unsafe: The save path for Story Block (Block Editor) fields now auto-converts Markdown input to ProseMirror JSON on write. After N deploys, any headless or AI-agent client that submits Markdown and reads back the field expects a structured ProseMirror JSON object. Rolling back to N-1 reverts to the old pass-through behavior: Markdown is stored unchanged, and those clients receive a plain Markdown string instead of ProseMirror JSON — breaking their expectation. (Content already stored as ProseMirror JSON by N continues to read correctly on N-1, so the risk is narrowly limited to future Markdown submissions after rollback.)
  • Code that makes it unsafe: dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java lines 258-313 — the new processStoryBlockField / toStoryBlockJson methods that intercept Story Block values and convert Markdown to TiptapMarkdown.toTiptap(value).toString(). Test expectation changes in dotcms-postman/src/main/resources/postman/BringBack.postman_collection.json, JsScriptAPI.postman_collection.json, VersionableResource.postman_collection.json, and test-karate/src/test/java/tests/defaults/CheckingJSONAttributes.feature confirm the response shape changes from a Markdown string to a ProseMirror JSON object.
  • Alternative (if possible): The change is inherently additive (ProseMirror JSON written by N is natively readable by N-1), so a two-phase approach is not required for data safety. To protect newly-written headless clients from a rollback, document the Markdown auto-conversion as a rollback-incompatible behavior in the release notes, and advise consumers that this feature is only available on version N and above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Not Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Workflow fire: convert Block Editor HTML/Markdown to ProseMirror JSON on save (server-side)

1 participant