Commit cf87707

feat: add string/base/distances/hamming-code-points
---
type: pre_commit_static_analysis_report
description: Results of running static analysis checks when committing changes.
report:
  - task: lint_filenames
    status: passed
  - task: lint_editorconfig
    status: passed
  - task: lint_markdown
    status: passed
  - task: lint_package_json
    status: passed
  - task: lint_repl_help
    status: passed
  - task: lint_javascript_src
    status: passed
  - task: lint_javascript_cli
    status: na
  - task: lint_javascript_examples
    status: passed
  - task: lint_javascript_tests
    status: passed
  - task: lint_javascript_benchmarks
    status: passed
  - task: lint_python
    status: na
  - task: lint_r
    status: na
  - task: lint_c_src
    status: na
  - task: lint_c_examples
    status: na
  - task: lint_c_benchmarks
    status: na
  - task: lint_c_tests_fixtures
    status: na
  - task: lint_shell
    status: na
  - task: lint_typescript_declarations
    status: passed
  - task: lint_typescript_tests
    status: passed
  - task: lint_license_headers
    status: passed
---
1 parent a1f8442 commit cf87707

10 files changed: +789 -0 lines changed

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
1+
<!--
2+
3+
@license Apache-2.0
4+
5+
Copyright (c) 2026 The Stdlib Authors.
6+
7+
Licensed under the Apache License, Version 2.0 (the "License");
8+
you may not use this file except in compliance with the License.
9+
You may obtain a copy of the License at
10+
11+
http://www.apache.org/licenses/LICENSE-2.0
12+
13+
Unless required by applicable law or agreed to in writing, software
14+
distributed under the License is distributed on an "AS IS" BASIS,
15+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
16+
See the License for the specific language governing permissions and
17+
limitations under the License.
18+
19+
-->
20+
21+
# hammingDistanceCodePoints
22+
23+
> Calculate the [Hamming distance][hamming-distance] between two equal-length strings by comparing Unicode code points.
24+
25+
<!-- Package usage documentation. -->
26+
27+
<section class="usage">
28+
29+
## Usage
30+
31+
```javascript
32+
var hammingDistanceCodePoints = require( '@stdlib/string/base/distances/hamming-code-points' );
33+
```
34+
35+
#### hammingDistanceCodePoints( s1, s2 )
36+
37+
Calculates the [Hamming distance][hamming-distance] between two equal-length strings by comparing Unicode code points.
38+
39+
```javascript
40+
var dist = hammingDistanceCodePoints( 'frog', 'from' );
41+
// returns 1
42+
43+
dist = hammingDistanceCodePoints( 'tooth', 'froth' );
44+
// returns 2
45+
46+
dist = hammingDistanceCodePoints( 'cat', 'cot' );
47+
// returns 1
48+
49+
dist = hammingDistanceCodePoints( '', '' );
50+
// returns 0
51+
52+
// Emoji encoded as a single code point (a UTF-16 surrogate pair) count as one position:
53+
dist = hammingDistanceCodePoints( '👋', '🌍' );
54+
// returns 1
55+
56+
dist = hammingDistanceCodePoints( 'a👋b', 'c🌍d' );
57+
// returns 3
58+
```
59+
60+
</section>
61+
62+
<!-- /.usage -->
63+
64+
<!-- Package notes. Make sure to keep an empty line after the `section` element and another before the `/section` close. -->
65+
66+
<section class="notes">
67+
68+
## Notes
69+
70+
- The [Hamming distance][hamming-distance] is only defined for strings having the same number of Unicode code points. When provided two input strings with an unequal number of Unicode code points, the function returns a sentinel value of `-1`.
71+
- Unlike `@stdlib/string/base/distances/hamming`, which compares UTF-16 code units, this function compares **Unicode code points**. Surrogate pairs (used to encode characters outside the Basic Multilingual Plane, such as most emoji) are therefore treated as a single unit of comparison. For example, the emoji `'👋'` (U+1F44B) is encoded as the UTF-16 surrogate pair `\uD83D\uDC4B` and has a `length` of `2`, but this function treats it as a single code point.
72+
- The function is **not** grapheme-cluster aware. User-perceived characters composed of multiple Unicode code points (e.g., family emoji built from several emoji joined by Zero Width Joiners, or letters followed by combining diacritical marks) are compared code point by code point, so each constituent code point counts as a separate position.
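
The following is a minimal sketch illustrating the notes above. It does not use the package implementation. Both `codePointDistance` and `codeUnitDistance` are hypothetical stand-ins written for illustration, and the sketch assumes an environment providing `Array.from` and `String.fromCodePoint` (ES2015).

```javascript
// Hypothetical stand-in (NOT the package implementation): counts differing code points.
function codePointDistance( s1, s2 ) {
    // A string's iterator yields one element per Unicode code point:
    var a = Array.from( s1 );
    var b = Array.from( s2 );
    var d = 0;
    var i;
    if ( a.length !== b.length ) {
        // Sentinel value for an unequal number of code points:
        return -1;
    }
    for ( i = 0; i < a.length; i++ ) {
        if ( a[ i ] !== b[ i ] ) {
            d += 1;
        }
    }
    return d;
}

// Hypothetical stand-in for a comparison over UTF-16 code units:
function codeUnitDistance( s1, s2 ) {
    var d = 0;
    var i;
    if ( s1.length !== s2.length ) {
        return -1;
    }
    for ( i = 0; i < s1.length; i++ ) {
        if ( s1.charCodeAt( i ) !== s2.charCodeAt( i ) ) {
            d += 1;
        }
    }
    return d;
}

var wave = String.fromCodePoint( 0x1F44B ); // surrogate pair \uD83D\uDC4B
var globe = String.fromCodePoint( 0x1F30D ); // surrogate pair \uD83C\uDF0D

console.log( wave.length );
// => 2

console.log( codePointDistance( wave, globe ) );
// => 1

console.log( codeUnitDistance( wave, globe ) );
// => 2

// Not grapheme-cluster aware: 'e' + U+0301 is two code points, while U+00E9 is one:
console.log( codePointDistance( 'e\u0301', '\u00E9' ) );
// => -1
```

In the last call, both inputs render as the same user-perceived character, yet they differ in their number of code points, so the sketch returns the `-1` sentinel. Callers needing grapheme-level comparison may want to normalize or segment their input beforehand.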
73+
74+
</section>
75+
76+
<!-- /.notes -->
77+
78+
<!-- Package usage examples. -->
79+
80+
<section class="examples">
81+
82+
## Examples
83+
84+
```javascript
85+
var hammingDistanceCodePoints = require( '@stdlib/string/base/distances/hamming-code-points' );
86+
87+
var dist = hammingDistanceCodePoints( 'algorithms', 'altruistic' );
88+
// returns 7
89+
90+
dist = hammingDistanceCodePoints( 'elephant', 'hippopod' );
91+
// returns 7
92+
93+
dist = hammingDistanceCodePoints( 'javascript', 'typescript' );
94+
// returns 4
95+
96+
dist = hammingDistanceCodePoints( 'hamming', 'ladybug' );
97+
// returns 5
98+
99+
// Emoji strings (each emoji = 1 Unicode code point):
100+
dist = hammingDistanceCodePoints( '👋🌍🎉', '🌟💫✨' );
101+
// returns 3
102+
103+
// Mixed ASCII and emoji:
104+
dist = hammingDistanceCodePoints( 'hello👋', 'hallo🌍' );
105+
// returns 2
106+
```
107+
108+
</section>
109+
110+
<!-- /.examples -->
111+
112+
<!-- Section for related `stdlib` packages. Do not manually edit this section, as it is automatically populated. -->
113+
114+
<section class="related">
115+
116+
</section>
117+
118+
<!-- /.related -->
119+
120+
<!-- Section for all links. Make sure to keep an empty line after the `section` element and another before the `/section` close. -->
121+
122+
<section class="links">
123+
124+
[hamming-distance]: https://en.wikipedia.org/wiki/Hamming_distance
125+
126+
</section>
127+
128+
<!-- /.links -->
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
1+
/**
2+
* @license Apache-2.0
3+
*
4+
* Copyright (c) 2026 The Stdlib Authors.
5+
*
6+
* Licensed under the Apache License, Version 2.0 (the "License");
7+
* you may not use this file except in compliance with the License.
8+
* You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing, software
13+
* distributed under the License is distributed on an "AS IS" BASIS,
14+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
* See the License for the specific language governing permissions and
16+
* limitations under the License.
17+
*/
18+
19+
'use strict';
20+
21+
// MODULES //
22+
23+
var bench = require( '@stdlib/bench' );
24+
var pkg = require( './../package.json' ).name;
25+
var hammingDistanceCodePoints = require( './../lib' );
26+
27+
28+
// MAIN //
29+
30+
bench( pkg, function benchmark( b ) {
31+
var values;
32+
var value;
33+
var out;
34+
var i;
35+
36+
values = [
37+
[ 'algorithms', 'altruistic' ],
38+
[ '1638452297', '4444884447' ],
39+
[ '', '' ],
40+
[ 'z', 'a' ],
41+
[ 'aaappppk', 'aardvark' ],
42+
[ 'frog', 'flog' ],
43+
[ 'fly', 'ant' ],
44+
[ 'elephant', 'hippopod' ],
45+
[ 'hippopod', 'elephant' ],
46+
[ 'hippo', 'zzzzz' ],
47+
[ 'hello', 'hallo' ],
48+
[ '👋🌍🎉', '🌟💫✨' ],
49+
[ 'a👋b', 'c🌍d' ],
50+
[ 'congratulations', 'conmgeautlatins' ]
51+
];
52+
53+
b.tic();
54+
for ( i = 0; i < b.iterations; i++ ) {
55+
value = values[ i%values.length ];
56+
out = hammingDistanceCodePoints( value[0], value[1] );
57+
if ( typeof out !== 'number' ) {
58+
b.fail( 'should return a number' );
59+
}
60+
}
61+
b.toc();
62+
if ( typeof out !== 'number' ) {
63+
b.fail( 'should return a number' );
64+
}
65+
b.pass( 'benchmark finished' );
66+
b.end();
67+
});
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
1+
2+
{{alias}}( s1, s2 )
3+
Calculates the Hamming distance between two equal-length strings by
4+
comparing Unicode code points.
5+
6+
The function returns a sentinel value of -1 if the two input strings differ
7+
in the number of Unicode code points.
8+
9+
Parameters
10+
----------
11+
s1: string
12+
First input string.
13+
14+
s2: string
15+
Second input string.
16+
17+
Returns
18+
-------
19+
out: number
20+
Hamming distance.
21+
22+
Examples
23+
--------
24+
> var d = {{alias}}( 'algorithms', 'altruistic' )
25+
7
26+
> d = {{alias}}( 'elephant', 'hippopod' )
27+
7
28+
> d = {{alias}}( 'javascript', 'typescript' )
29+
4
30+
> d = {{alias}}( '👋', '🌍' )
31+
1
32+
> d = {{alias}}( 'a👋', 'b🌍' )
33+
2
34+
35+
See Also
36+
--------
37+
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
/*
2+
* @license Apache-2.0
3+
*
4+
* Copyright (c) 2026 The Stdlib Authors.
5+
*
6+
* Licensed under the Apache License, Version 2.0 (the "License");
7+
* you may not use this file except in compliance with the License.
8+
* You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing, software
13+
* distributed under the License is distributed on an "AS IS" BASIS,
14+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
* See the License for the specific language governing permissions and
16+
* limitations under the License.
17+
*/
18+
19+
// TypeScript Version: 4.1
20+
21+
/**
22+
* Calculates the Hamming distance between two equal-length strings by comparing Unicode code points.
23+
*
24+
* ## Notes
25+
*
26+
* - The function returns a sentinel value of `-1` if the two input strings differ in the number of Unicode code points.
27+
*
28+
* @param str1 - first input string
29+
* @param str2 - second input string
30+
* @returns Hamming distance
31+
*
32+
* @example
33+
* var dist = hammingDistanceCodePoints( 'fly', 'ant' );
34+
* // returns 3
35+
*
36+
* @example
37+
* var dist = hammingDistanceCodePoints( '👋', '🌍' );
38+
* // returns 1
39+
*
40+
* @example
41+
* var dist = hammingDistanceCodePoints( 'algorithms', 'altruistic' );
42+
* // returns 7
43+
*/
44+
declare function hammingDistanceCodePoints( str1: string, str2: string ): number;
45+
46+
47+
// EXPORTS //
48+
49+
export = hammingDistanceCodePoints;
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
1+
/*
2+
* @license Apache-2.0
3+
*
4+
* Copyright (c) 2026 The Stdlib Authors.
5+
*
6+
* Licensed under the Apache License, Version 2.0 (the "License");
7+
* you may not use this file except in compliance with the License.
8+
* You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing, software
13+
* distributed under the License is distributed on an "AS IS" BASIS,
14+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
* See the License for the specific language governing permissions and
16+
* limitations under the License.
17+
*/
18+
19+
import hammingDistanceCodePoints = require( './index' );
20+
21+
22+
// TESTS //
23+
24+
// The function returns a number...
25+
{
26+
hammingDistanceCodePoints( '', '' ); // $ExpectType number
27+
hammingDistanceCodePoints( 'fly', 'ant' ); // $ExpectType number
28+
hammingDistanceCodePoints( '👋', '🌍' ); // $ExpectType number
29+
}
30+
31+
// The compiler throws an error if the function is provided a first argument which is not a string...
32+
{
33+
hammingDistanceCodePoints( true, '' ); // $ExpectError
34+
hammingDistanceCodePoints( false, '' ); // $ExpectError
35+
hammingDistanceCodePoints( null, '' ); // $ExpectError
36+
hammingDistanceCodePoints( undefined, '' ); // $ExpectError
37+
hammingDistanceCodePoints( 5, '' ); // $ExpectError
38+
hammingDistanceCodePoints( [], '' ); // $ExpectError
39+
hammingDistanceCodePoints( {}, '' ); // $ExpectError
40+
hammingDistanceCodePoints( ( x: number ): number => x, '' ); // $ExpectError
41+
}
42+
43+
// The compiler throws an error if the function is provided a second argument which is not a string...
44+
{
45+
hammingDistanceCodePoints( '', true ); // $ExpectError
46+
hammingDistanceCodePoints( '', false ); // $ExpectError
47+
hammingDistanceCodePoints( '', null ); // $ExpectError
48+
hammingDistanceCodePoints( '', undefined ); // $ExpectError
49+
hammingDistanceCodePoints( '', 5 ); // $ExpectError
50+
hammingDistanceCodePoints( '', [] ); // $ExpectError
51+
hammingDistanceCodePoints( '', {} ); // $ExpectError
52+
hammingDistanceCodePoints( '', ( x: number ): number => x ); // $ExpectError
53+
}
54+
55+
// The compiler throws an error if the function is provided an unsupported number of arguments...
56+
{
57+
hammingDistanceCodePoints(); // $ExpectError
58+
hammingDistanceCodePoints( '' ); // $ExpectError
59+
hammingDistanceCodePoints( '', '', 3 ); // $ExpectError
60+
}
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
1+
/**
2+
* @license Apache-2.0
3+
*
4+
* Copyright (c) 2026 The Stdlib Authors.
5+
*
6+
* Licensed under the Apache License, Version 2.0 (the "License");
7+
* you may not use this file except in compliance with the License.
8+
* You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing, software
13+
* distributed under the License is distributed on an "AS IS" BASIS,
14+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
* See the License for the specific language governing permissions and
16+
* limitations under the License.
17+
*/
18+
19+
'use strict';
20+
21+
var hammingDistanceCodePoints = require( './../lib' );
22+
23+
console.log( hammingDistanceCodePoints( 'algorithms', 'altruistic' ) );
24+
// => 7
25+
26+
console.log( hammingDistanceCodePoints( 'elephant', 'hippopod' ) );
27+
// => 7
28+
29+
console.log( hammingDistanceCodePoints( 'javascript', 'typescript' ) );
30+
// => 4
31+
32+
// All emoji strings:
33+
console.log( hammingDistanceCodePoints( '👋🌍🎉', '🌟💫✨' ) );
34+
// => 3
35+
36+
// Mixed ASCII and emoji strings:
37+
console.log( hammingDistanceCodePoints( 'a👋b', 'c🌍d' ) );
38+
// => 3
39+
40+
// Strings with an unequal number of code points return -1:
41+
console.log( hammingDistanceCodePoints( 'a', 'abcissa' ) );
42+
// => -1
