Commit 1d2ebfc

Merge pull request #208 from j9t/fix/element-removal-and-duplicate-attrs
Fix empty element removal (when having `id`) and duplicate-attribute handling
2 parents f30af6e + d5fbaa7 commit 1d2ebfc

8 files changed: +118 -19 lines changed

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -4,6 +4,13 @@ As of version 2.0.0, all notable changes to HTML Minifier Next (HMN) are documen
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [5.0.3] - 2026-02-04
+
+### Fixed
+
+* Fixed empty elements with `id` attributes being incorrectly removed when `removeEmptyElements` was enabled (elements with IDs must be preserved as they serve as navigation targets, JavaScript selectors, and accessibility landmarks)
+* Fixed duplicate attributes being preserved in output (per HTML spec, when an attribute appears multiple times, [only the first occurrence is kept](https://html.spec.whatwg.org/multipage/parsing.html#attribute-name-state); duplicates are now removed during minification)
+
 ## [5.0.2] - 2026-02-03
 
 ### Fixed

README.md

Lines changed: 14 additions & 14 deletions
@@ -403,38 +403,38 @@ How does HTML Minifier Next compare to other minifiers? (All minification with t
| Site | Original Size (KB) | [HTML Minifier Next](https://github.com/j9t/html-minifier-next) ([config](https://github.com/j9t/html-minifier-next/blob/main/benchmarks/html-minifier.json))<br>[![npm last update](https://img.shields.io/npm/last-update/html-minifier-next)](https://socket.dev/npm/package/html-minifier-next) | [htmlnano](https://github.com/posthtml/htmlnano)<br>[![npm last update](https://img.shields.io/npm/last-update/htmlnano)](https://socket.dev/npm/package/htmlnano) | [@swc/html](https://github.com/swc-project/swc)<br>[![npm last update](https://img.shields.io/npm/last-update/@swc/html)](https://socket.dev/npm/package/@swc/html) | [minify-html](https://github.com/wilsonzlin/minify-html)<br>[![npm last update](https://img.shields.io/npm/last-update/@minify-html/node)](https://socket.dev/npm/package/@minify-html/node) | [minimize](https://github.com/Swaagie/minimize)<br>[![npm last update](https://img.shields.io/npm/last-update/minimize)](https://socket.dev/npm/package/minimize) | [html­com­pressor.­com](https://htmlcompressor.com/) |
| --- | --- | --- | --- | --- | --- | --- | --- |
 | [A List Apart](https://alistapart.com/) | 63 | **53** | 55 | 56 | 55 | 58 | 56 |
-| [Apple](https://www.apple.com/) | 236 | **197** | 209 | 212 | 213 | 215 | 215 |
-| [BBC](https://www.bbc.co.uk/) | 647 | **601** | 607 | 608 | 609 | 642 | n/a |
-| [CERN](https://home.cern/) | 150 | **82** | 90 | 90 | 90 | 92 | 95 |
+| [Apple](https://www.apple.com/) | 236 | **198** | 209 | 212 | 213 | 215 | 215 |
+| [BBC](https://www.bbc.co.uk/) | 651 | **605** | 611 | 612 | 613 | 646 | n/a |
+| [CERN](https://home.cern/) | 150 | **80** | 90 | 90 | 90 | 92 | 95 |
 | [CSS-Tricks](https://css-tricks.com/) | 155 | 127 | **121** | 136 | 137 | 141 | 138 |
-| [ECMAScript](https://tc39.es/ecma262/) | 7261 | **6411** | 6583 | 6465 | 6589 | 6637 | n/a |
+| [ECMAScript](https://tc39.es/ecma262/) | 7261 | **6447** | 6583 | 6465 | 6589 | 6637 | n/a |
 | [EDRi](https://edri.org/) | 80 | **68** | 69 | 69 | 71 | 74 | 72 |
-| [EFF](https://www.eff.org/) | 54 | **45** | 48 | 47 | 48 | 49 | 49 |
+| [EFF](https://www.eff.org/) | 54 | **45** | 49 | 47 | 48 | 49 | 49 |
 | [European Alternatives](https://european-alternatives.eu/) | 48 | **30** | 32 | 32 | 32 | 32 | 32 |
-| [FAZ](https://www.faz.net/aktuell/) | 1545 | 1412 | **1385** | 1471 | 1482 | 1492 | n/a |
+| [FAZ](https://www.faz.net/aktuell/) | 1519 | 1389 | **1364** | 1446 | 1457 | 1467 | n/a |
 | [French Tech](https://lafrenchtech.gouv.fr/) | 153 | **122** | 126 | 126 | 126 | 132 | 127 |
-| [Frontend Dogma](https://frontenddogma.com/) | 227 | **219** | 240 | 225 | 227 | 246 | 227 |
+| [Frontend Dogma](https://frontenddogma.com/) | 227 | **219** | 240 | 225 | 227 | 246 | 226 |
 | [Google](https://www.google.com/) | 18 | **16** | **16** | **16** | 17 | 18 | 18 |
-| [Ground News](https://ground.news/) | 1513 | **1371** | 1394 | 1418 | 1424 | 1499 | n/a |
+| [Ground News](https://ground.news/) | 1465 | **1328** | 1350 | 1372 | 1377 | 1453 | n/a |
 | [HTML Living Standard](https://html.spec.whatwg.org/multipage/) | 149 | 148 | 153 | **147** | 149 | 155 | 149 |
 | [Igalia](https://www.igalia.com/) | 49 | **34** | 36 | 36 | 36 | 37 | 37 |
-| [Leanpub](https://leanpub.com/) | 243 | **226** | 229 | 228 | 229 | 239 | 240 |
+| [Leanpub](https://leanpub.com/) | 241 | **224** | 226 | 226 | 227 | 236 | 238 |
 | [Mastodon](https://mastodon.social/explore) | 38 | 35 | **32** | 35 | 36 | 37 | 37 |
-| [MDN](https://developer.mozilla.org/en-US/) | 109 | **62** | 64 | 65 | 65 | 68 | 68 |
+| [MDN](https://developer.mozilla.org/en-US/) | 109 | **63** | 64 | 65 | 65 | 68 | 68 |
 | [Middle East Eye](https://www.middleeasteye.net/) | 220 | **194** | 200 | 198 | 198 | 199 | 200 |
 | [Mistral AI](https://mistral.ai/) | 343 | **307** | **307** | 310 | 311 | 340 | n/a |
 | [Mozilla](https://www.mozilla.org/) | 47 | **32** | 35 | 35 | 35 | 36 | 36 |
 | [Nielsen Norman Group](https://www.nngroup.com/) | 97 | 73 | **59** | 78 | 80 | 81 | 81 |
-| [SitePoint](https://www.sitepoint.com/) | 495 | 456 | **431** | 468 | 473 | 491 | n/a |
+| [SitePoint](https://www.sitepoint.com/) | 494 | 456 | **431** | 468 | 473 | 491 | n/a |
 | [Startup-Verband](https://startupverband.de/) | 43 | **30** | 31 | **30** | 31 | 31 | 31 |
 | [TetraLogical](https://tetralogical.com/) | 59 | 52 | **49** | 51 | 53 | 53 | 53 |
 | [TPGi](https://www.tpgi.com/) | 173 | **157** | 159 | 163 | 164 | 170 | 170 |
 | [United Nations](https://www.un.org/en/) | 151 | **113** | 121 | 125 | 124 | 130 | 123 |
-| [Vivaldi](https://vivaldi.com/) | 93 | **75** | n/a | 79 | 81 | 84 | 82 |
+| [Vivaldi](https://vivaldi.com/) | 93 | **76** | n/a | 79 | 81 | 84 | 82 |
 | [W3C](https://www.w3.org/) | 50 | **36** | 39 | 38 | 38 | 41 | 39 |
-| **Average processing time** | | 77 ms (30/30) | 149 ms (29/30) | 49 ms (30/30) | **17 ms (30/30)** | 330 ms (30/30) | 1513 ms (24/30) |
+| **Average processing time** | | 83 ms (30/30) | 165 ms (29/30) | 47 ms (30/30) | **13 ms (30/30)** | 283 ms (30/30) | 1254 ms (24/30) |
 
-(Last updated: Feb 3, 2026)
+(Last updated: Feb 4, 2026)
 <!-- End auto-generated -->
 
Notes: Minimize does not minify CSS and JS. [HTML Minifier Terser](https://github.com/terser/html-minifier-terser) is currently not included due to issues around whitespace collapsing and removal of code using modern CSS features, issues which appeared to distort the data.
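
To read the size columns as relative savings, the arithmetic can be sketched as below. This is a minimal illustration, not part of the benchmark tooling: the helper name `reductionPercent` is hypothetical, and the three rows are copied from the updated HTML Minifier Next column above.

```javascript
// Sizes in KB, copied from the updated rows of the benchmark table
const rows = [
  { site: 'Apple', original: 236, minified: 198 },
  { site: 'CERN', original: 150, minified: 80 },
  { site: 'MDN', original: 109, minified: 63 }
];

// Hypothetical helper: percentage of the original size removed by minification
function reductionPercent(original, minified) {
  return (1 - minified / original) * 100;
}

const report = rows.map(({ site, original, minified }) =>
  `${site}: ${reductionPercent(original, minified).toFixed(1)}%`
);

console.log(report.join('\n'));
// Apple: 16.1%
// CERN: 46.7%
// MDN: 42.2%
```

Because the table reports rounded kilobytes, percentages derived this way are approximate.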

package-lock.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

package.json

Lines changed: 1 addition & 1 deletion
@@ -94,5 +94,5 @@
   },
   "type": "module",
   "types": "./dist/types/htmlminifier.d.ts",
-  "version": "5.0.2"
+  "version": "5.0.3"
 }

src/htmlminifier.js

Lines changed: 7 additions & 1 deletion
@@ -38,7 +38,8 @@ import {
   isExecutableScript,
   isStyleElement,
   normalizeAttr,
-  buildAttr
+  buildAttr,
+  deduplicateAttributes
 } from './lib/attributes.js';
 
 import {
@@ -1125,6 +1126,11 @@ async function minifyHTML(value, options, partialMarkup) {
 
 buffer.push(openTag);
 
+// Remove duplicate attributes (per HTML spec, first occurrence wins)
+// Duplicate attributes result in invalid HTML
+// https://html.spec.whatwg.org/multipage/parsing.html#attribute-name-state
+deduplicateAttributes(attrs, options.caseSensitive);
+
 if (options.sortAttributes) {
   options.sortAttributes(tag, attrs);
 }
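
The hunk above calls `deduplicateAttributes` before `options.sortAttributes`. One plausible reason for that order is sketched below: deduplicating first keeps "first occurrence" tied to source order, whatever the sorter does. Both helpers are hypothetical stand-ins written for this illustration. In particular, `sortByNameThenValue` is an assumed sorter, not HMN's implementation.

```javascript
// Stand-in for deduplication: keep the first attribute per lowercased name
function dedupeFirstWins(attrs) {
  const seen = new Set();
  return attrs.filter((attr) => {
    const key = attr.name.toLowerCase();
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Hypothetical sorter (NOT HMN's implementation): orders by name, then by value
function sortByNameThenValue(attrs) {
  return [...attrs].sort((a, b) =>
    a.name.localeCompare(b.name) || a.value.localeCompare(b.value)
  );
}

// Source order: title="zebra" comes first, so a parser would use "zebra"
const parsed = [
  { name: 'title', value: 'zebra' },
  { name: 'class', value: 'a' },
  { name: 'title', value: 'apple' }
];

const dedupeThenSort = sortByNameThenValue(dedupeFirstWins(parsed));
const sortThenDedupe = dedupeFirstWins(sortByNameThenValue(parsed));

const titleOf = (attrs) => attrs.find((attr) => attr.name === 'title').value;

console.log(titleOf(dedupeThenSort)); // zebra (matches source order)
console.log(titleOf(sortThenDedupe)); // apple (the sorter moved the later duplicate ahead)
```

With a sorter that only compares names and is stable, both orders would give the same result. The difference shows up only when the sorter can reorder attributes that share a name.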

src/lib/attributes.js

Lines changed: 32 additions & 1 deletion
@@ -65,6 +65,36 @@ function attributesInclude(attributes, attribute) {
   return false;
 }
 
+/**
+ * Remove duplicate attributes from an attribute list.
+ * Per HTML spec, when an attribute appears multiple times, the first occurrence wins.
+ * Duplicate attributes result in invalid HTML, so we keep only the first.
+ * @param {Array} attrs - Array of attribute objects with `name` property
+ * @param {boolean} caseSensitive - Whether to compare names case-sensitively (for XML/SVG)
+ * @returns {Array} Deduplicated attribute array (modifies in place and returns)
+ */
+function deduplicateAttributes(attrs, caseSensitive) {
+  if (attrs.length < 2) {
+    return attrs;
+  }
+
+  const seen = new Set();
+  let writeIndex = 0;
+
+  for (let i = 0; i < attrs.length; i++) {
+    const attr = attrs[i];
+    const key = caseSensitive ? attr.name : attr.name.toLowerCase();
+
+    if (!seen.has(key)) {
+      seen.add(key);
+      attrs[writeIndex++] = attr;
+    }
+  }
+
+  attrs.length = writeIndex;
+  return attrs;
+}
+
 function isAttributeRedundant(tag, attrName, attrValue, attrs) {
   // Fast-path: Check if this element–attribute combination can possibly be redundant
   // before doing expensive string operations
@@ -604,5 +634,6 @@ export {
   // Cleaners
   cleanAttributeValue,
   normalizeAttr,
-  buildAttr
+  buildAttr,
+  deduplicateAttributes
 };
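
To see what the new helper does in isolation, the function body from the hunk above can be run against small hand-made inputs. This is a sketch: the attribute objects are hypothetical and carry only the `name` and `value` properties, whereas real attribute objects in the minifier may hold more fields.

```javascript
// Copy of the in-place deduplication added in src/lib/attributes.js
function deduplicateAttributes(attrs, caseSensitive) {
  if (attrs.length < 2) {
    return attrs;
  }

  const seen = new Set();
  let writeIndex = 0;

  for (let i = 0; i < attrs.length; i++) {
    const attr = attrs[i];
    const key = caseSensitive ? attr.name : attr.name.toLowerCase();

    if (!seen.has(key)) {
      seen.add(key);
      attrs[writeIndex++] = attr; // compact survivors toward the front
    }
  }

  attrs.length = writeIndex; // truncate the leftover tail
  return attrs;
}

// Hypothetical inputs for illustration
const htmlAttrs = [
  { name: 'Class', value: 'a' },
  { name: 'id', value: 'x' },
  { name: 'CLASS', value: 'b' }
];
const svgAttrs = [
  { name: 'viewBox', value: '0 0 100 100' },
  { name: 'viewbox', value: '0 0 200 200' }
];

const htmlResult = deduplicateAttributes(htmlAttrs, false);
const svgResult = deduplicateAttributes(svgAttrs, true);

console.log(htmlResult.map((a) => `${a.name}=${a.value}`)); // [ 'Class=a', 'id=x' ]
console.log(htmlResult === htmlAttrs); // true (the input array is mutated)
console.log(svgResult.length); // 2 (names differ by case, so both are kept)
```

The write-index pattern removes duplicates in a single pass without allocating a second array, which is why the caller in `minifyHTML` can ignore the return value.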

src/lib/elements.js

Lines changed: 10 additions & 0 deletions
@@ -80,6 +80,16 @@ function canRemovePrecedingTag(optionalEndTag, tag) {
 // Element removal logic
 
 function canRemoveElement(tag, attrs) {
+  // Elements with `id` attribute must never be removed—they serve as:
+  // - Navigation targets (skip links, URL fragments)
+  // - JavaScript selector targets (`getElementById`, `querySelector`)
+  // - CSS targets (`:target` pseudo-class, ID selectors)
+  // - Accessibility landmarks (ARIA references)
+  // - Portal mount points (React portals, etc.)
+  if (hasAttrName('id', attrs)) {
+    return false;
+  }
+
   switch (tag) {
     case 'textarea':
       return false;
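
The early return on `id` can be sketched in isolation as follows. This is a simplified stand-in, not the project's code: `hasAttrName` is not shown in this diff, so the version here is an assumed implementation, and `canRemoveEmptyElement` covers only the two cases visible in the hunk (the real `canRemoveElement` has further `switch` cases that are cut off above).

```javascript
// Assumed implementation: the real `hasAttrName` is not part of this diff
function hasAttrName(name, attrs) {
  return attrs.some((attr) => attr.name.toLowerCase() === name);
}

// Simplified stand-in for `canRemoveElement`
function canRemoveEmptyElement(tag, attrs) {
  // An `id` makes the element addressable, so it is never removable
  if (hasAttrName('id', attrs)) {
    return false;
  }

  switch (tag) {
    case 'textarea':
      return false;
    default:
      return true; // assumption: the real function checks more tags here
  }
}

const withId = canRemoveEmptyElement('div', [{ name: 'id', value: 'start-of-content' }]);
const plainSpan = canRemoveEmptyElement('span', []);
const textarea = canRemoveEmptyElement('textarea', []);

console.log(withId, plainSpan, textarea); // false true false
```

Placing the `id` check before the `switch` means it applies to every tag, including ones whose own case would otherwise allow removal.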

tests/html.spec.js

Lines changed: 45 additions & 0 deletions
@@ -180,6 +180,32 @@ describe('HTML', () => {
     assert.strictEqual(await minify('<input title="bar" id="boo" value="hello world">'), '<input title="bar" id="boo" value="hello world">');
   });
 
+  test('Deduplicate attributes', async () => {
+    // Per HTML spec, when duplicate attributes are present, the first occurrence wins
+    // Duplicate attributes result in invalid HTML, so we deduplicate them
+
+    // Simple duplicate attribute
+    assert.strictEqual(await minify('<html data-color-mode="auto" data-color-mode="light"></html>'), '<html data-color-mode="auto"></html>');
+
+    // Multiple duplicates of same attribute
+    assert.strictEqual(await minify('<div class="a" id="x" class="b" class="c"></div>'), '<div class="a" id="x"></div>');
+
+    // Different attributes, some duplicated
+    assert.strictEqual(await minify('<p title="first" class="a" title="second"></p>'), '<p title="first" class="a"></p>');
+
+    // Duplicate data attributes (common in GitHub-style HTML)
+    assert.strictEqual(await minify('<html data-dark-theme="dark" data-dark-theme="dark_dimmed"></html>'), '<html data-dark-theme="dark"></html>');
+
+    // No duplicates—should pass through unchanged
+    assert.strictEqual(await minify('<div id="x" class="y" title="z"></div>'), '<div id="x" class="y" title="z"></div>');
+
+    // Case-insensitive deduplication in HTML mode
+    assert.strictEqual(await minify('<div Class="a" CLASS="b" class="c"></div>'), '<div class="a"></div>');
+
+    // Case-sensitive deduplication in SVG mode
+    assert.strictEqual(await minify('<svg viewBox="0 0 100 100" viewBox="0 0 200 200"></svg>'), '<svg viewBox="0 0 100 100"></svg>');
+  });
+
   test('Space normalization around text', async () => {
     let input, output;
     input = ' <p>blah</p>\n\n\n ';
@@ -1699,6 +1725,25 @@ describe('HTML', () => {
     assert.strictEqual(await minify(input, { removeEmptyElements: true }), input);
     output = '<div>Empty<!-- NOT --></div>';
     assert.strictEqual(await minify(input, { collapseWhitespace: true, removeEmptyElements: true }), output);
+
+    // Elements with `id` attribute must not be removed
+    input = '<div id="start-of-content"></div>';
+    assert.strictEqual(await minify(input, { removeEmptyElements: true }), input);
+
+    input = '<div id="__primerPortalRoot__"></div>';
+    assert.strictEqual(await minify(input, { removeEmptyElements: true }), input);
+
+    input = '<ul id="query-builder-results" data-action="combobox-commit"></ul>';
+    assert.strictEqual(await minify(input, { removeEmptyElements: true }), input);
+
+    // Element with `id` and other attributes
+    input = '<span id="target" class="highlight" aria-hidden="true"></span>';
+    assert.strictEqual(await minify(input, { removeEmptyElements: true }), input);
+
+    // Multiple elements, only those without `id` should be removed
+    input = '<div id="keep"></div><span></span><p id="also-keep"></p>';
+    output = '<div id="keep"></div><p id="also-keep"></p>';
+    assert.strictEqual(await minify(input, { removeEmptyElements: true }), output);
   });
 
   test('`removeEmptyElementsExcept`', async () => {
