Commit becad26

fraud protection: Only count unverified otp in countries per ip bucket #5682
2 parents d8a5ae4 + e461243 commit becad26

8 files changed

Lines changed: 485 additions & 12 deletions

docs/plans/fraud-protection-implementation.md renamed to docs/plans/fraud-protection/01-fraud-protection-implementation.md

File renamed without changes.
Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
1+
# Fraud Protection: Exclude Verified Countries from IP-Country Warning
2+
3+
## Summary
4+
5+
Adjust `SMS__PHONE_COUNTRIES__BY_IP__DAILY_THRESHOLD_EXCEEDED` so it counts only countries that have **no verified SMS OTP in the same 24h window**.
6+
7+
If an IP has at least one verified OTP for a country during that window, that country is excluded from the distinct-country count. The threshold remains `3`.
8+
9+
This is a behavior change only. No backward-compatibility work is required.
10+
11+
## Runtime Change
12+
13+
### Current behavior
14+
15+
`pkg/lib/fraudprotection/leaky_bucket_store.go` currently tracks distinct phone countries per IP in the Redis ZSET keyed by:
16+
17+
`app:{appID}:fraud_protection:ip_countries:{ip}`
18+
19+
`RecordSMSOTPSent(...)` updates that ZSET and triggers `IPCountriesDaily` when the raw distinct-country count exceeds the fixed threshold.
20+
21+
### New behavior
22+
23+
Add a second Redis ZSET per IP to record countries that have at least one verified SMS OTP in the same 24h window:
24+
25+
`app:{appID}:fraud_protection:ip_verified_countries:{ip}`
26+
27+
The send-path warning becomes:
28+
29+
1. Count distinct countries seen from the IP in the last 24h.
30+
2. Remove any country that appears in the verified-country ZSET for that IP.
31+
3. Trigger `SMS__PHONE_COUNTRIES__BY_IP__DAILY_THRESHOLD_EXCEEDED` only when the filtered count is `> 3`.
32+
33+
### Interface Details
34+
35+
#### `pkg/lib/fraudprotection/service.go`
36+
37+
`LeakyBucketer` becomes:
38+
39+
```go
40+
type LeakyBucketer interface {
41+
RecordSMSOTPSent(ctx context.Context, ip, phoneCountry string, thresholds LeakyBucketThresholds) (LeakyBucketTriggered, LeakyBucketLevels, error)
42+
RecordSMSOTPVerified(ctx context.Context, ip, phoneCountry string, thresholds LeakyBucketThresholds, count int) error
43+
RecordSMSOTPVerifiedCountry(ctx context.Context, ip, phoneCountry string) error
44+
}
45+
```
46+
47+
`Service.RecordSMSOTPVerified(ctx, phoneNumber)` keeps the current parse/write/drain flow, but now runs the following steps in this order:
48+
49+
1. read `ip` from the request context and parse `phoneNumber` to get `phoneCountry`
50+
2. `Metrics.RecordVerified(ctx, ip, phoneCountry)`
51+
3. `LeakyBucket.RecordSMSOTPVerifiedCountry(ctx, ip, phoneCountry)`
52+
4. `RevertSMSOTPSent(ctx, phoneNumber, 1)`
53+
54+
The verified-country update is a side effect of a successful SMS OTP consumption only. It is not part of the alt-auth revert path.
55+
56+
This split matters because the lower-level `LeakyBucketStore.RecordSMSOTPVerified(...)` method is also used by `RevertSMSOTPSent(...)` to drain unverified OTPs that were sent during a flow but never consumed. If the same method also marked a country as verified, alt-auth cleanup would incorrectly promote unverified sends into the verified-country set.
57+
58+
So the invariant is:
59+
60+
- service-level `RecordSMSOTPVerified(...)` = actual verification event
61+
- store-level `RecordSMSOTPVerified(...)` = drain-only bookkeeping for verified and reverted counts
62+
- store-level `RecordSMSOTPVerifiedCountry(...)` = explicit marker for a real verified OTP
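A small sketch of this invariant, under simplifying assumptions: `fakeStore`, `recordSMSOTPVerified`, and `revertSMSOTPSent` below are hypothetical stand-ins with reduced signatures (no `context.Context`, no thresholds, no metrics write), not the real `LeakyBucketStore` or `Service` code. It only shows that the revert path drains without marking a country as verified.

```go
package main

import "fmt"

// fakeStore is a hypothetical in-memory stand-in for the bucket store.
type fakeStore struct {
	drained  int             // total count drained from the leaky buckets
	verified map[string]bool // "ip|country" pairs marked as verified
}

func newFakeStore() *fakeStore {
	return &fakeStore{verified: map[string]bool{}}
}

// RecordSMSOTPVerified is drain-only: it never touches the verified set.
func (s *fakeStore) RecordSMSOTPVerified(ip, country string, count int) {
	s.drained += count
}

// RecordSMSOTPVerifiedCountry is the explicit marker for a real verified OTP.
func (s *fakeStore) RecordSMSOTPVerifiedCountry(ip, country string) {
	s.verified[ip+"|"+country] = true
}

// revertSMSOTPSent models the alt-auth cleanup path: drain only.
func revertSMSOTPSent(s *fakeStore, ip, country string, count int) {
	s.RecordSMSOTPVerified(ip, country, count)
}

// recordSMSOTPVerified models the service-level verification event:
// mark the country first, then drain through the existing revert path.
func recordSMSOTPVerified(s *fakeStore, ip, country string) {
	s.RecordSMSOTPVerifiedCountry(ip, country)
	revertSMSOTPSent(s, ip, country, 1)
}

func main() {
	s := newFakeStore()

	// Alt-auth cleanup of two unconsumed OTPs: drained, nothing verified.
	revertSMSOTPSent(s, "203.0.113.7", "HK", 2)
	fmt.Println(s.drained, len(s.verified)) // 2 0

	// A real verification: one more drain, and SG is now marked verified.
	recordSMSOTPVerified(s, "203.0.113.7", "SG")
	fmt.Println(s.drained, len(s.verified)) // 3 1
}
```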
63+
64+
#### `pkg/lib/fraudprotection/leaky_bucket_store.go`
65+
66+
`leakyBucketScript` remains unchanged and continues to be used only for the four leaky buckets on the send path.
67+
68+
`ipCountriesScript` remains the script that computes the IP-country warning on the send path. That is where the filtered distinct-country count is evaluated.
69+
70+
The verified-country marker is implemented by `RecordSMSOTPVerifiedCountry(ctx, ip, phoneCountry)`, which writes to the IP-scoped verified-country ZSET. It can reuse the same 24h retention model as the send-path country ZSET, but it is intentionally a separate store method so the service can call it only for real OTP consumption, not for alt-auth cleanup.
71+
72+
`RecordSMSOTPVerified(ctx, ip, phoneCountry, thresholds, count)` remains drain-only and is still used by `RevertSMSOTPSent(...)` for unverified OTP cleanup.
73+
74+
The filtered IP-country count is computed in the send-path script, not in Go:
75+
76+
```go
77+
res, err := conn.Eval(ctx, ipCountriesScript,
78+
[]string{s.ipCountriesKey(ip), s.ipVerifiedCountriesKey(ip)},
79+
phoneCountry, now, ipCountriesThreshold, 2*bucketWindowDaily,
80+
).Slice()
81+
```
82+
83+
`ipCountriesScript` is responsible for:
84+
85+
1. `ZADD` the sent country into `ip_countries`
86+
2. prune expired entries from both `ip_countries` and `ip_verified_countries`
87+
3. `ZRANGE` both ZSETs and build a Lua lookup table for the verified countries
88+
4. count the distinct sent countries that are not present in the verified-country lookup table
89+
5. return `{filtered_count, triggered_int}`
90+
91+
This keeps the send-path warning atomic with the country update and avoids a race between sent-country recording and verified-country exclusion.
92+
93+
## File-Level Changes
94+
95+
### `pkg/lib/fraudprotection/leaky_bucket_store.go`
96+
97+
- Keep the existing `ip_countries` ZSET and the four leaky buckets unchanged.
98+
- Add a verified-country ZSET helper named `ipVerifiedCountriesKey(ip string) string`.
99+
- Add a new store method `RecordSMSOTPVerifiedCountry(ctx context.Context, ip, phoneCountry string) error`.
100+
- Keep `RecordSMSOTPVerified(...)` drain-only.
101+
- Update `RecordSMSOTPSent(...)` so the IP-country warning uses the filtered count described above.
102+
103+
### `pkg/lib/fraudprotection/service.go`
104+
105+
- Extend `LeakyBucketer` with the new verified-country recording method.
106+
- Update `Service.RecordSMSOTPVerified(...)` so the verified-OTP flow becomes:
107+
1. parse the phone number
108+
2. write the `sms_otp_verified` metric
109+
3. call `RecordSMSOTPVerifiedCountry(...)`
110+
4. call `RevertSMSOTPSent(..., 1)` to drain the leaky buckets through the existing path
111+
- Leave `CheckAndRecord(...)`, threshold computation, and warning mapping unchanged.
112+
113+
### `docs/specs/fraud-protection.md`
114+
115+
- Rewrite the `SMS__PHONE_COUNTRIES__BY_IP__DAILY_THRESHOLD_EXCEEDED` section to state that the warning counts only countries without a verified SMS OTP in the same 24h window.
116+
- Keep the threshold at `3`.
117+
118+
## Test Plan
119+
120+
### Unit tests
121+
122+
#### `pkg/lib/fraudprotection/leaky_bucket_store_test.go`
123+
124+
- Add coverage for the new verified-country key helper.
125+
- Add a case proving that a verified country is excluded from the IP-country count.
126+
- Add a case proving the verified-country marker respects the same 24h expiry behavior as the existing country set.
127+
- Keep the current regression that proves four unverified countries from one IP still trigger the warning.
128+
- Add a case proving `RecordSMSOTPVerifiedCountry(...)` is called only for actual verification, not for alt-auth cleanup.
129+
- Add a case proving `RevertSMSOTPSent(...)` still drains the buckets and does not mark verified countries.
130+
131+
#### `pkg/lib/fraudprotection/service_test.go`
132+
133+
- Extend the leaky-bucket stub with the new verified-country method.
134+
- Add a test that `RecordSMSOTPVerified(...)` records the verified-country marker and still drains the buckets.
135+
136+
### E2E tests
137+
138+
- Keep `e2e/tests/fraud_protection/sms_phone_countries_by_ip_daily.test.yaml` as the baseline regression for the unverified-country case.
139+
- Add a new e2e test under `e2e/tests/fraud_protection/` that:
140+
- verifies one country first
141+
- sends unverified OTPs to three other countries successfully
142+
- blocks on the 4th unverified country
143+
- proves the verified country does not contribute to the threshold
144+
145+
## Assumptions
146+
147+
- “In the period” means the existing 24h sliding window.
148+
- The verified-country marker is keyed by IP because the warning itself is IP-scoped.
149+
- Old Redis keys can expire naturally; no migration or backfill is needed.
150+
- No config schema, database schema, or generated code changes are required.
151+
152+
## Implementation Order
153+
154+
1. Add the verified-country Redis storage and filtered counting logic in `pkg/lib/fraudprotection/leaky_bucket_store.go`.
155+
2. Wire the new method through `pkg/lib/fraudprotection/service.go`.
156+
3. Update unit tests for store and service behavior.
157+
4. Update the fraud-protection spec text.
158+
5. Add the e2e regression test for the verified-country exclusion case.
159+
160+
## Atomic Commits
161+
162+
1. `fraud: exclude verified countries from SMS IP-country counting`
163+
- Files: `pkg/lib/fraudprotection/leaky_bucket_store.go`, `pkg/lib/fraudprotection/service.go`, `pkg/lib/fraudprotection/leaky_bucket_store_test.go`, `pkg/lib/fraudprotection/service_test.go`
164+
- Scope: storage, service wiring, and unit coverage.
165+
2. `doc,e2e: update SMS IP-country fraud protection semantics`
166+
- Files: `docs/specs/fraud-protection.md`, `e2e/tests/fraud_protection/*.test.yaml`
167+
- Scope: spec wording and end-to-end regression coverage.

docs/specs/fraud-protection.md

Lines changed: 2 additions & 1 deletion
@@ -104,6 +104,8 @@ Examples:
104104
#### SMS__PHONE_COUNTRIES__BY_IP__DAILY_THRESHOLD_EXCEEDED
105105
Check if the number of distinct countries of requested phone numbers from a single IP exceeds the threshold in 24 hours.
106106

107+
Only countries with no verified SMS OTP from the same IP in the same 24h window are counted. If an IP has at least one verified SMS OTP for a country during that window, that country is excluded from the distinct-country count.
108+
107109
The threshold is 3.
108110

109111

@@ -349,4 +351,3 @@ fraud_protection:
349351
hook:
350352
url: authgeardeno:///deno/script.ts
351353
```
352-
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
1+
name: Fraud protection - SMS phone countries by IP daily excludes verified country
2+
authgear.yaml:
3+
override: |
4+
fraud_protection:
5+
decision:
6+
action: deny_if_any_warning
7+
steps:
8+
# Flow 1 - SG number, then successfully verify it. SG should no longer count
9+
# toward the IP-country threshold for the same 24h window.
10+
- name: flow 1 - create (SG)
11+
action: create
12+
input: |
13+
{"type": "signup", "name": "default"}
14+
- name: flow 1 - identify phone (SG)
15+
action: input
16+
input: |
17+
{"identification": "phone", "login_id": "+6591230001"}
18+
- name: flow 1 - send sms (SG)
19+
action: input
20+
input: |
21+
{"channel": "sms"}
22+
output:
23+
result: |
24+
{"action": {"type": "verify"}}
25+
- name: flow 1 - verify otp (SG)
26+
action: input
27+
input: |
28+
{"code": "111111"}
29+
30+
# Flows 2-4 are three distinct unverified countries from the same IP. All should
31+
# still be allowed because the verified SG country is excluded from the count.
32+
- name: flow 2 - create (HK)
33+
action: create
34+
input: |
35+
{"type": "signup", "name": "default"}
36+
- name: flow 2 - identify phone (HK)
37+
action: input
38+
input: |
39+
{"identification": "phone", "login_id": "+85291230001"}
40+
- name: flow 2 - send sms (HK)
41+
action: input
42+
input: |
43+
{"channel": "sms"}
44+
output:
45+
result: |
46+
{"action": {"type": "verify"}}
47+
48+
- name: flow 3 - create (MY)
49+
action: create
50+
input: |
51+
{"type": "signup", "name": "default"}
52+
- name: flow 3 - identify phone (MY)
53+
action: input
54+
input: |
55+
{"identification": "phone", "login_id": "+60123450001"}
56+
- name: flow 3 - send sms (MY)
57+
action: input
58+
input: |
59+
{"channel": "sms"}
60+
output:
61+
result: |
62+
{"action": {"type": "verify"}}
63+
64+
- name: flow 4 - create (TH)
65+
action: create
66+
input: |
67+
{"type": "signup", "name": "default"}
68+
- name: flow 4 - identify phone (TH)
69+
action: input
70+
input: |
71+
{"identification": "phone", "login_id": "+66812340001"}
72+
- name: flow 4 - send sms (TH)
73+
action: input
74+
input: |
75+
{"channel": "sms"}
76+
output:
77+
result: |
78+
{"action": {"type": "verify"}}
79+
80+
# Flow 5 is the 4th unverified country from this IP, so it should be blocked.
81+
- name: flow 5 - create (US)
82+
action: create
83+
input: |
84+
{"type": "signup", "name": "default"}
85+
- name: flow 5 - identify phone (US)
86+
action: input
87+
input: |
88+
{"identification": "phone", "login_id": "+12125550001"}
89+
- name: flow 5 - send sms (blocked)
90+
action: input
91+
input: |
92+
{"channel": "sms"}
93+
output:
94+
error: |
95+
{
96+
"name": "TooManyRequest",
97+
"reason": "BlockedByFraudProtection",
98+
"code": 429
99+
}

pkg/lib/fraudprotection/leaky_bucket_store.go

Lines changed: 47 additions & 5 deletions
@@ -86,21 +86,42 @@ return {new_level, (new_level > threshold) and 1 or 0}
8686

8787
// ipCountriesScript tracks distinct countries seen from a given IP in the past 24h
8888
// using a sorted set keyed by country code with the last-seen timestamp as the score.
89-
// KEYS[1] = sorted set key
89+
// KEYS[1] = sent-countries sorted set key
90+
// KEYS[2] = verified-countries sorted set key
9091
// ARGV[1] = alpha2 country code
9192
// ARGV[2] = now (unix timestamp)
9293
// ARGV[3] = threshold (fixed = 3)
9394
// ARGV[4] = ttl_seconds (2 * 86400)
94-
// Returns {count, triggered_int}.
95+
// Returns {filtered_count, triggered_int}.
9596
var ipCountriesScript = `
9697
local now = tonumber(ARGV[2])
9798
local cutoff = now - 86400
9899
100+
-- 1. Use ZADD to record a send event in the sent-countries sorted set
99101
redis.call('ZADD', KEYS[1], now, ARGV[1])
102+
-- 2. Use ZREMRANGEBYSCORE to drop records at or before the cutoff in both sets before counting
100103
redis.call('ZREMRANGEBYSCORE', KEYS[1], '-inf', cutoff)
104+
redis.call('ZREMRANGEBYSCORE', KEYS[2], '-inf', cutoff)
105+
-- 3. Update the expiry of both sets to ensure they are not cleaned up while we still need them
101106
redis.call('EXPIRE', KEYS[1], ARGV[4])
107+
redis.call('EXPIRE', KEYS[2], ARGV[4])
108+
109+
-- 4. Count sent countries that have no verified OTP in the window
110+
local sent_countries = redis.call('ZRANGE', KEYS[1], 0, -1)
111+
local verified_countries = redis.call('ZRANGE', KEYS[2], 0, -1)
112+
local verified_lookup = {}
113+
114+
for _, country in ipairs(verified_countries) do
115+
verified_lookup[country] = true
116+
end
117+
118+
local count = 0
119+
for _, country in ipairs(sent_countries) do
120+
if not verified_lookup[country] then
121+
count = count + 1
122+
end
123+
end
102124
103-
local count = redis.call('ZCARD', KEYS[1])
104125
return {count, (count > tonumber(ARGV[3])) and 1 or 0}
105126
`
106127

@@ -170,10 +191,11 @@ func (s *LeakyBucketStore) RecordSMSOTPSent(ctx context.Context, ip, phoneCountr
170191
return err
171192
}
172193

173-
// Update ip_countries ZSET.
194+
// Update ip_countries ZSET and exclude countries with a verified OTP in the same window.
174195
ipCountriesKey := s.ipCountriesKey(ip)
196+
ipVerifiedCountriesKey := s.ipVerifiedCountriesKey(ip)
175197
res, err := conn.Eval(ctx, ipCountriesScript,
176-
[]string{ipCountriesKey},
198+
[]string{ipCountriesKey, ipVerifiedCountriesKey},
177199
phoneCountry, now, ipCountriesThreshold, 2*bucketWindowDaily,
178200
).Slice()
179201
if err != nil {
@@ -225,10 +247,30 @@ func (s *LeakyBucketStore) RecordSMSOTPVerified(ctx context.Context, ip, phoneCo
225247
})
226248
}
227249

250+
func (s *LeakyBucketStore) RecordSMSOTPVerifiedCountry(ctx context.Context, ip, phoneCountry string) error {
251+
now := float64(s.Clock.NowUTC().Unix())
252+
253+
return s.Redis.WithConnContext(ctx, func(ctx context.Context, conn redis.Redis_6_0_Cmdable) error {
254+
return conn.Eval(ctx, `
255+
-- Update the last verified otp timestamp of the country code in the sorted set
256+
redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2])
257+
redis.call('EXPIRE', KEYS[1], ARGV[3])
258+
return 1
259+
`,
260+
[]string{s.ipVerifiedCountriesKey(ip)},
261+
now, phoneCountry, 2*bucketWindowDaily,
262+
).Err()
263+
})
264+
}
265+
228266
func (s *LeakyBucketStore) bucketKey(period int, dimension, value string) string {
229267
return fmt.Sprintf("app:%s:fraud_protection:leaky_bucket:%d:%s:%s", string(s.AppID), period, dimension, value)
230268
}
231269

232270
func (s *LeakyBucketStore) ipCountriesKey(ip string) string {
233271
return fmt.Sprintf("app:%s:fraud_protection:ip_countries:%s", string(s.AppID), ip)
234272
}
273+
274+
func (s *LeakyBucketStore) ipVerifiedCountriesKey(ip string) string {
275+
return fmt.Sprintf("app:%s:fraud_protection:ip_verified_countries:%s", string(s.AppID), ip)
276+
}
