
Repflow Lambda Case Interview Instructions

Context

This exercise focuses on the AWS Lambda function in lambda/lambda_function.py. It processes inbound emails delivered via Amazon SES and S3, classifies and parses potential sponsorship opportunities with AI agents, persists conversation data through a backend API (lambda/api_client.py), conditionally creates deals, and may send a reply via SES.

Your goal is to demonstrate your ability to:

  • Read and understand unfamiliar code quickly.
  • Infer the business use case and data flow.
  • Identify design patterns and pitfalls.
  • Propose and justify fixes and improvements.
  • Design and write tests using mocks/doubles for external systems.

Timebox your work as appropriate (suggested total: 2–3 hours). Prioritize clarity and judgment.


Folder Contents You’ll Use

  • lambda/lambda_function.py — Main Lambda logic and AI orchestration.
  • lambda/api_client.py — Async client for backend API calls (aiohttp).
  • lambda/package.py — Packaging helper for Lambda (reference only, do not run).
  • lambda/database/ — Data models (reference only; Lambda uses API, not DB directly).

You do not need the rest of the repository to complete this exercise.


Environment Assumptions (Do Not Call Real Services)

  • You must NOT make real AWS or HTTP calls; use mocks for every external dependency.
  • Expected environment variables:
    • API_BASE_URL
    • API_AUTH_TOKEN (required by create_api_client() in api_client.py)
    • S3_BUCKET
    • SES_REGION
    • OPENAI_API_KEY (agents are mocked; no real calls)
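
For tests, it helps to validate this environment up front. A minimal, hypothetical helper (the variable names match the list above; the validation logic itself is an assumption, not code from the repo):

```python
import os

# Variable names taken from the exercise's environment assumptions.
REQUIRED_ENV_VARS = (
    "API_BASE_URL",
    "API_AUTH_TOKEN",
    "S3_BUCKET",
    "SES_REGION",
    "OPENAI_API_KEY",
)

def validate_env(env=os.environ):
    """Return the required variables as a dict, raising if any are missing or empty."""
    missing = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_ENV_VARS}
```

In tests, pass a plain dict (or use pytest's monkeypatch) instead of touching the real os.environ.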

High-Level Flow (What the Lambda Does)

Key entry points and helpers in lambda/lambda_function.py:

  • lambda_handler(event, context) → async_lambda_handler(event, context)
  • Parse event: parse_ses_event() → builds EmailEvent (message_id, thread_id, sender/recipient, subject)
  • Fetch email body: fetch_email_from_s3(bucket, key)
  • Early exits/gates:
    • Already processed: was_email_processed() / mark_email_processed()
    • Blacklist: is_blacklisted()
    • No-reply detection via regex
  • Store inbound message: store_message() (SimpleMessage via API)
  • Build contextual conversation history: get_conversation_history() + format_conversation_history()
  • Run FSM for deal + conversation: run_deal_conversation_fsm()
    • Agents: deal_filter_agent, deal_parser_agent, comprehensive_preference_agent, qa_polisher_agent
    • Qualification logic: check_qualification_criteria()
    • Possible deal creation: create_deal_from_qualified_lead() and APIClient.create_conversation_for_deal()
  • Conditional reply via SES: send_ses_reply()

Also present for reference: process_multi_agent_qualification(), a legacy pipeline-style flow that is not the primary path.
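
The steps above can be sketched as a skeleton. This is illustrative control flow only: function names follow the list above, but the signatures are simplified, the helpers are injected as stubs, and the real handler in lambda_function.py may order things differently:

```python
import asyncio

async def async_lambda_handler_sketch(event, deps):
    """Illustrative flow only; `deps` bundles stubbed helpers (an assumption, not the real API)."""
    email = deps["parse_ses_event"](event)

    # Early exits/gates.
    if await deps["was_email_processed"](email["message_id"]):
        return {"statusCode": 200, "body": "already processed"}
    if await deps["is_blacklisted"](email["sender"]):
        await deps["mark_email_processed"](email["message_id"])
        return {"statusCode": 200, "body": "blacklisted"}

    # Fetch body, store inbound message, run the FSM.
    body = deps["fetch_email_from_s3"](email["message_id"])
    await deps["store_message"](email, body)
    reply = await deps["run_deal_conversation_fsm"](email, body)

    # Conditional reply, then mark processed.
    if reply:
        deps["send_ses_reply"](email, reply)
    await deps["mark_email_processed"](email["message_id"])
    return {"statusCode": 200, "body": "ok"}
```

Driving a skeleton like this with recording stubs is one way to check ordering of side effects (store before reply, mark-processed last) before writing tests against the real module.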


Your Tasks

1) Code Comprehension and Business Inference

  • Deliver a concise system diagram or bullet flow covering the steps above.
  • Summarize the business use case: qualifying inbound brand sponsorships, gathering missing info, evaluating against creator preferences (User.preferences), creating deals, continuing the conversation, and replying.

Artifacts to submit:

  • A short write-up (Markdown) with the diagram/flow and business summary.

2) Design Patterns and Trade-offs

Identify patterns and discuss trade-offs with specific references to functions/classes:

  • Orchestration/Pipeline: run_deal_conversation_fsm() vs process_multi_agent_qualification()
  • State Machine: LeadDealState, ConversationState (from database.models.lead)
  • DTOs/Validation: LeadQualificationData, Deal, QAPolisherResponse, EmailEvent
  • API Client abstraction: APIClient methods (get_lead_by_thread_id, create_deal, etc.)
  • Utilities: extract_email_address, normalize_subject_for_thread, generate_thread_id, extract_budget_value

Artifacts:

  • Bullet list of patterns and trade-offs with code citations.

3) Pitfalls, Edge Cases, and Risks

Provide a prioritized list with suggested mitigations. Examples to consider (not exhaustive):

  • ID field inconsistency: deals sometimes use id vs leads using _id (see create_deal_from_qualified_lead() vs log lines in get_lead_by_thread_id()).
  • Signature mismatch: qa_polisher_agent forbids signature but send_ses_reply() appends one.
  • Async correctness: sync boto3 inside async handler (fetch_email_from_s3, send_ses_reply) may block.
  • Idempotency placement: mark_email_processed() timing vs side effects.
  • S3 availability: missing or delayed object should be handled gracefully before agent calls.
  • Thread ID design: hashing normalized subject + sorted emails can collide for similar threads.
  • Packaging/runtime mismatch: package.py pins Python 3.13 which may not match AWS supported runtime.
  • Logging/PII: printing full event payload.
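
To make the thread ID concern concrete, here is a sketch of the hashing scheme as described (hash of normalized subject plus sorted participant emails; the real generate_thread_id may differ in normalization details):

```python
import hashlib
import re

def generate_thread_id_sketch(subject: str, email_a: str, email_b: str) -> str:
    """Hash of normalized subject + sorted participant emails (direction/case invariant)."""
    # Strip leading Re:/Fwd:/Fw: prefixes (possibly stacked), then lowercase.
    normalized = re.sub(
        r"^(?:(?:re|fwd?)\s*:\s*)+", "", subject.strip(), flags=re.IGNORECASE
    ).lower()
    participants = "|".join(sorted((email_a.lower(), email_b.lower())))
    return hashlib.sha256(f"{normalized}|{participants}".encode()).hexdigest()[:16]
```

The collision risk follows directly: two genuinely distinct threads between the same two parties with the same normalized subject produce the same ID, so conversations can be merged incorrectly.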

Artifacts:

  • Risk list with code references and your proposed mitigations.

4) Testing Design and Targeted Unit/Integration Tests

Write tests with mocks/doubles. Do not call real services.

  • Pure function unit tests:

    • extract_email_address() (various header formats)
    • normalize_subject_for_thread() (strips Re/Fwd variants, whitespace)
    • generate_thread_id() (direction/case invariance; subject variations)
    • extract_budget_value() (currencies, commas, ranges → lower bound)
    • determine_deal_type_from_budget() (affiliate/hybrid/ugc/flat)
    • is_partnership_type_accepted() (acceptance matrix)
    • format_conversation_history() output markers
  • FSM path tests (mock agents + APIClient):

    • Deal Filter → Sponsorship high confidence vs Other/low confidence
    • Deal Parser populates some fields, leaves others missing
    • Preference Agent returns CONTINUE/NEGOTIATE/REJECT
    • Verify run_deal_conversation_fsm() transitions and side effects, including deal creation when check_qualification_criteria() passes
  • Handler integration-style tests (async):

    • Already processed → early 200
    • Blacklisted/no-reply → mark processed and 200
    • Existing qualified lead with conversation → create_message_for_conversation() path and fallback to store_message()
    • New Sponsorship thread → INFO_GATHERING ask or finalization; ensure store_message() and reply logic; mark_email_processed() called
    • SES send success/failure: outbound message stored only on success
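
As a sketch of the pure-function style, here is one such test using a stand-in implementation built on email.utils.parseaddr. The stub is an assumption about how extract_email_address behaves, not the repo's code; your real tests should import the actual function:

```python
from email.utils import parseaddr

def extract_email_address_stub(header: str) -> str:
    """Stand-in for extract_email_address: pull the bare, lowercased address from a header."""
    return parseaddr(header)[1].lower()

# Table-driven cases covering common From/To header formats.
CASES = [
    ("Brand Rep <rep@brand.com>", "rep@brand.com"),
    ("rep@brand.com", "rep@brand.com"),
    ('"Rep, Brand" <Rep@Brand.com>', "rep@brand.com"),
]

def test_extract_email_address():
    for header, expected in CASES:
        assert extract_email_address_stub(header) == expected
```

With pytest, the same table is usually expressed via @pytest.mark.parametrize so each case reports independently.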

Artifacts:

  • Test plan write-up and code (prefer pytest + pytest-asyncio).
  • Use mocks for:
    • agents.Runner.run(...) returning instances compatible with DealFilterResponse, LeadQualificationData, PreferenceEvaluationDraftResponse, QAPolisherResponse
    • APIClient methods (get_lead_by_thread_id, create_lead, update_lead, create_deal, get_conversation_history, create_conversation_for_deal, create_message_for_conversation, create_simple_message, check_email_processed, mark_email_processed, check_email_blacklisted)
    • boto3 S3/SES calls in fetch_email_from_s3 and send_ses_reply
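
The async mocking pattern might look like this. The APIClient method name comes from the list above; the driver function and the Runner result shape are assumptions for illustration:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

async def exercise(api_client, runner):
    """Hypothetical driver standing in for the code under test."""
    lead = await api_client.get_lead_by_thread_id("thread-1")
    result = await runner.run("deal_filter_agent", "email body")
    return lead, result.final_output

# AsyncMock makes every method awaitable; set return values per call site.
api = AsyncMock()
api.get_lead_by_thread_id.return_value = {"_id": "lead-1", "status": "NEW"}

runner = AsyncMock()
runner.run.return_value = MagicMock(
    final_output={"is_sponsorship": True, "confidence": 0.9}
)

lead, output = asyncio.run(exercise(api, runner))
```

For boto3, patch the module-level client (or the fetch_email_from_s3/send_ses_reply wrappers) with plain MagicMock, since those calls are synchronous.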

5) Proposed Fixes and Small Refactors

Provide a short design note with specific, scoped changes (cite functions/lines):

  • Align data contracts: standardize id vs _id for deals/leads throughout the Lambda.
  • Signature policy: remove or make signature optional in send_ses_reply() to match qa_polisher_agent rules.
  • Idempotency: move mark_email_processed() earlier and add idempotent inserts on message creation.
  • Resilience: add retries/backoff to APIClient._make_request() for transient errors.
  • Async correctness: consider aioboto3 or run sync boto3 calls in a thread executor.
  • Config hygiene: require env vars; reduce PII logs; fix package.py runtime to match Lambda.
  • Observability: structured logs keyed by thread_id; basic counters for outcomes.
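
Two of the fixes above can be sketched generically: a retry-with-backoff wrapper (for APIClient._make_request-style transient failures) and offloading a blocking boto3 call via asyncio.to_thread. Names and error classes here are illustrative, not from the repo:

```python
import asyncio

async def with_retries(coro_factory, attempts=3, base_delay=0.1,
                       retriable=(ConnectionError, TimeoutError)):
    """Retry an async operation with exponential backoff on transient errors."""
    for attempt in range(attempts):
        try:
            return await coro_factory()
        except retriable:
            if attempt == attempts - 1:
                raise  # exhausted retries; surface the original error
            await asyncio.sleep(base_delay * 2 ** attempt)

async def fetch_email_async(s3_client, bucket, key):
    """Run the blocking boto3 get_object in a worker thread so the event loop stays free."""
    response = await asyncio.to_thread(s3_client.get_object, Bucket=bucket, Key=key)
    return response["Body"].read().decode()
```

aioboto3 is the heavier alternative; to_thread is the smaller, lower-risk change for a handful of S3/SES calls.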

Artifacts:

  • A brief Markdown with your proposed changes and rationale.

Sample SES Event Payload

Use this as a starting point for tests of parse_ses_event() and the handler. Timestamps may vary.

{
  "Records": [{
    "ses": {
      "mail": {
        "messageId": "abcdef123",
        "timestamp": "2025-10-08T21:06:52Z",
        "commonHeaders": {
          "from": ["Brand Rep <rep@brand.com>"],
          "to": ["creator@repflow.app"],
          "subject": "Sponsorship opportunity for Q4"
        }
      },
      "receipt": {}
    }
  }]
}
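
A minimal parser over this payload might look like the following. It is illustrative only: the real parse_ses_event builds an EmailEvent (including thread_id) and will differ in shape and error handling:

```python
def parse_ses_event_sketch(event: dict) -> dict:
    """Pull the fields the Lambda cares about out of an SES event record (sketch)."""
    mail = event["Records"][0]["ses"]["mail"]
    headers = mail["commonHeaders"]
    return {
        "message_id": mail["messageId"],
        "sender": headers["from"][0],
        "recipient": headers["to"][0],
        "subject": headers["subject"],
    }
```

A fixture returning the sample payload above, plus variants with missing Records/commonHeaders, covers both the happy path and the malformed-event edge cases.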

Submission

Please provide:

  • Your Markdown write-ups (Stages 1–3 and 5) in a single SOLUTION.md.
  • Your tests (a tests/ folder is fine) with instructions to run them locally using mocks. Prefer pytest (pytest-asyncio for async), and indicate your Python version.
  • Brief notes on assumptions and any scope you intentionally left out due to time.

We will evaluate:

  • Clarity and correctness of your understanding.
  • Practicality and prioritization of your critiques and fixes.
  • Thoughtful testing strategy with good isolation.
  • Communication quality and technical judgment.