Improve text sanitization and fix comment truncation#14
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves WhatsApp message formatting by enhancing HTML-to-text sanitization (preserving readable line breaks) and fixes comment workflow truncation/signature handling to respect organization comment length limits.
Changes:
- Update
sanitize_textto translate common HTML block/line-break tags into newlines (and collapse excessive newlines). - Adjust comment creation/preview logic to truncate comments consistently and append the signature dynamically.
- Exclude rejected proposals from the proposals workflow list and add a required
typefield to WhatsApp carousel cards.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| spec/services/workflows/base_workflow_spec.rb | Adds specs for new HTML line-break preservation behavior in sanitize_text. |
| app/services/decidim/chatbot/workflows/base_workflow.rb | Implements HTML-aware newline handling in sanitize_text. |
| app/services/decidim/chatbot/workflows/comments_workflow.rb | Changes comment preview/storage to truncate by max length and append signature dynamically. |
| app/services/decidim/chatbot/workflows/proposals_workflow.rb | Filters out rejected proposals from the proposals scope. |
| app/services/decidim/chatbot/providers/whatsapp/envelopes/interactive_carousel.rb | Adds required card type: "cta_url" field for WhatsApp interactive carousel payloads. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| hard_max_length = 1000 - signature.length | ||
| return hard_max_length unless resource.respond_to?(:component) | ||
|
|
||
| resource.component.organization.comments_max_length if resource.component.organization.comments_max_length.positive? | ||
| return organization.comments_max_length if organization.comments_max_length.to_i.positive? | ||
|
|
||
| hard_max_length |
There was a problem hiding this comment.
comments_max_length has a couple of issues: (1) resource.component.organization.comments_max_length if ... doesn’t return, so the organization-specific limit is effectively ignored unless setting.organization happens to be the same; (2) it calls .positive? on comments_max_length without to_i, which can raise if the value is nil; (3) when an org limit is present it doesn’t subtract signature.length, so the stored comment can exceed the org max once the signature is appended. Consider computing a single max_total = [hard_limit, org_limit].min and returning max_total - signature.length (or similar), with nil-safe access to component/org.
| hard_max_length = 1000 - signature.length | |
| return hard_max_length unless resource.respond_to?(:component) | |
| resource.component.organization.comments_max_length if resource.component.organization.comments_max_length.positive? | |
| return organization.comments_max_length if organization.comments_max_length.to_i.positive? | |
| hard_max_length | |
| # total hard limit, including the signature | |
| hard_total = 1000 | |
| signature_length = signature.length | |
| # default to the hard limit if we cannot determine a component/organization | |
| return hard_total - signature_length unless resource.respond_to?(:component) | |
| org = resource.component.respond_to?(:organization) ? resource.component.organization : nil | |
| org_limit_total = org&.comments_max_length.to_i | |
| org_limit_total = nil unless org_limit_total.positive? | |
| max_total = org_limit_total ? [hard_total, org_limit_total].min : hard_total | |
| # return the maximum number of characters allowed for the comment text itself | |
| [max_total - signature_length, 0].max |
|
|
||
| def proposals | ||
| @proposals ||= Decidim::Proposals::Proposal.where(component:).published.only_amendables | ||
| @proposals ||= Decidim::Proposals::Proposal.where(component:).published.except_rejected.only_amendables |
There was a problem hiding this comment.
The new behavior to exclude rejected proposals (.except_rejected) isn’t covered by existing ProposalsWorkflow specs. Please add a spec that creates a rejected proposal and asserts it does not appear in the carousel/list (and that counts/pagination reflect the filtered scope).
| def cards | ||
| data[:cards].map.with_index do |card, index| | ||
| { | ||
| type: "cta_url", # This is necessary, even if we're not using the URL feature |
There was a problem hiding this comment.
InteractiveCarousel#cards now adds type: "cta_url" on each card to satisfy the WhatsApp API requirement, but the envelope specs don’t assert this field. Add a focused assertion in spec/services/providers/whatsapp/envelopes_spec.rb so this requirement doesn’t regress silently.
| def sanitize_text(text, truncate = 1024) | ||
| strip_tags(translated_attribute(text)).truncate(truncate) | ||
| translated = translated_attribute(text) | ||
| # Convert HTML line breaks and block elements to newlines before stripping tags | ||
| sanitized = translated.to_s | ||
| .gsub(%r{<br\s*/?>}i, "\n") # <br>, <br/>, <br /> -> newline | ||
| .gsub(%r{<h[1-6][^>]*>(.*?)</h[1-6]>}im, "*\\1*\n\n") # <h*>title</h*> -> *title* with newlines | ||
| .gsub(%r{</p>}i, "\n\n") # </p> -> double newline for paragraphs | ||
| .gsub(/<p[^>]*>/i, "") # Remove opening <p> tags | ||
| .gsub(%r{</div>}i, "\n") # </div> -> newline | ||
| .gsub(/<div[^>]*>/i, "") # Remove opening <div> tags | ||
| strip_tags(sanitized).strip.gsub(/\n{3,}/, "\n\n").truncate(truncate) |
There was a problem hiding this comment.
sanitize_text still defaults to truncating at 1024 chars, but the updated spec expects 4000 chars by default. This mismatch will fail tests and likely truncates longer WhatsApp texts unexpectedly; align the default (or adjust the spec) so both behavior and tests agree.
|
|
||
| def send_comment_confirmation | ||
| body = sanitize_text("*#{I18n.t("decidim.chatbot.workflows.comments.comment_received")}*\n\n#{received_message.body}") | ||
| body = sanitize_text("*#{I18n.t("decidim.chatbot.workflows.comments.comment_received")}*\n\n#{received_message.body.truncate(comments_max_length)}") |
There was a problem hiding this comment.
send_comment_confirmation truncates the user text with comments_max_length, but then calls sanitize_text without passing a truncate length, so the preview will be truncated again using sanitize_text's default. To ensure the confirmation preview respects the same limit, pass an explicit truncate value to sanitize_text (and make sure it matches the stored comment’s limit calculation).
| body = sanitize_text("*#{I18n.t("decidim.chatbot.workflows.comments.comment_received")}*\n\n#{received_message.body.truncate(comments_max_length)}") | |
| body = sanitize_text("*#{I18n.t("decidim.chatbot.workflows.comments.comment_received")}*\n\n#{received_message.body.truncate(comments_max_length)}", comments_max_length) |
| def comment_body | ||
| sender.current_workflow_options["comment"] | ||
| original = sender.current_workflow_options["comment"].to_s.truncate(comments_max_length) | ||
| "#{original}#{signature}" | ||
| end |
There was a problem hiding this comment.
comment_body truncates original to comments_max_length and then appends signature, which can produce a final comment longer than the intended max length (especially when comments_max_length comes from the organization setting). Truncate the user-provided portion to (max_length - signature.length) (or have comments_max_length return a value that already accounts for the signature) so the final stored body never exceeds the limit.
Changes
Text Sanitization Improvements
Comment Workflow Fixes
Other Fixes
Tests