Organizations become orphaned when last owner user is deleted #515

@scotwells

Summary

When a User resource is deleted, OrganizationMemberships referencing that user persist with Ready=False (UserNotFound), leaving organizations in an orphaned state with no accessible owner. These organizations cannot be managed by anyone and require manual admin intervention to clean up.

Current Behavior

User Deleted
    ↓
OrganizationMembership persists (Ready=False, UserNotFound)
    ↓
Organization exists with "owner" membership pointing to deleted user
    ↓
No one can access the organization
    ↓
Webhook blocks cleanup because it sees an "owner" membership exists
    ↓
Organization is permanently orphaned ❌

Observed in Production

Three OrganizationMemberships were found with Ready=False and UserNotFound:

  • Memberships referencing user IDs that no longer exist
  • Organizations aged 81-133 days in orphaned state
  • Manual deletion of organizations was required to clean up
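A rough sketch of how affected memberships could be picked out of a listing, in case it helps with auditing other environments. The `Condition` and `Membership` types here are simplified stand-ins invented for illustration (the real resources presumably carry `metav1.Condition` entries in their status); only the Ready=False / UserNotFound combination comes from what was observed.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// Membership is a hypothetical, trimmed-down view of an OrganizationMembership.
type Membership struct {
	Name       string
	Org        string
	Conditions []Condition
}

// isOrphaned reports whether the membership has Ready=False with reason UserNotFound.
func isOrphaned(m Membership) bool {
	for _, c := range m.Conditions {
		if c.Type == "Ready" && c.Status == "False" && c.Reason == "UserNotFound" {
			return true
		}
	}
	return false
}

// orphanedMemberships returns the subset of memberships whose user is gone.
func orphanedMemberships(all []Membership) []Membership {
	var out []Membership
	for _, m := range all {
		if isOrphaned(m) {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	all := []Membership{
		{Name: "m1", Org: "org-a", Conditions: []Condition{{Type: "Ready", Status: "True", Reason: "Ready"}}},
		{Name: "m2", Org: "org-b", Conditions: []Condition{{Type: "Ready", Status: "False", Reason: "UserNotFound"}}},
	}
	for _, m := range orphanedMemberships(all) {
		fmt.Printf("orphaned membership %s (org %s)\n", m.Name, m.Org)
	}
}
```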

Root Cause Analysis

| Component | File | Issue |
|---|---|---|
| OrganizationMembership Controller | organization_membership_controller.go:131-149 | When user not found, marks Ready=False and returns - does NOT delete the membership |
| OrganizationMembership Webhook | organizationmembership_webhook.go:80-117 | Blocks deletion of "owner" membership without verifying the owner user actually exists |
| Organization Controller | organization_controller.go | No logic to detect orgs with no valid owners |
| User Controller/Webhook | user_controller.go, user_webhook.go | No cleanup of dependent resources on user deletion |

Proposed Solution

Keep cleanup logic in the ResourceManager domain (not in User/IAM domain) to maintain separation of concerns:

1. Update OrganizationMembership Webhook

Allow deletion when the referenced user doesn't exist:

func (v *OrganizationMembershipValidator) ValidateDelete(ctx context.Context, obj runtime.Object) (admission.Warnings, error) {
    membership, ok := obj.(*resourcemanagerv1alpha1.OrganizationMembership)
    if !ok {
        return nil, fmt.Errorf("expected an OrganizationMembership, got %T", obj)
    }

    // Allow deletion if the referenced user no longer exists.
    var user iamv1alpha1.User
    err := v.client.Get(ctx, client.ObjectKey{Name: membership.Spec.UserRef.Name}, &user)
    if apierrors.IsNotFound(err) {
        return nil, nil // User gone, allow cleanup
    }
    if err != nil {
        // Surface lookup failures instead of silently falling through.
        return nil, fmt.Errorf("looking up user %q: %w", membership.Spec.UserRef.Name, err)
    }

    // ... rest of existing validation ...
}

2. Update OrganizationMembership Controller

Delete memberships when the referenced user no longer exists:

// In Reconcile, after user not found:
if apierrors.IsNotFound(err) {
    logger.Info("user deleted, cleaning up membership", "user", organizationMembership.Spec.UserRef.Name)
    // Ignore NotFound on delete so an already-removed membership does not trigger a retry.
    return ctrl.Result{}, client.IgnoreNotFound(r.Client.Delete(ctx, &organizationMembership))
}

3. Update Organization Controller

Watch OrganizationMemberships and delete the organization when no valid owners remain:

func (r *OrganizationController) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // ... existing code ...

    // Delete the organization once no valid owner remains.
    hasValidOwners, err := r.hasValidOwners(ctx, &organization)
    if err != nil {
        return ctrl.Result{}, err
    }
    if !hasValidOwners {
        logger.Info("organization has no valid owners, deleting", "org", organization.Name)
        // Ignore NotFound so an already-deleted organization does not trigger a retry.
        return ctrl.Result{}, client.IgnoreNotFound(r.Client.Delete(ctx, &organization))
    }

    // ... rest of existing code ...
}
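The owner check itself is not spelled out above. One possible shape, as a minimal sketch: treat a membership as a valid owner only if its role is "owner", its user still resolves, and it is not already being deleted. The `OwnerCandidate` type and its fields are hypothetical placeholders, not the actual OrganizationMembership schema; a real helper would list memberships for the organization through the client and derive these values from spec and status.

```go
package main

import "fmt"

// OwnerCandidate is a hypothetical summary of one OrganizationMembership.
type OwnerCandidate struct {
	Role      string // role granted by the membership, e.g. "owner"
	UserFound bool   // whether the referenced User still exists
	Deleting  bool   // whether the membership has a deletion timestamp set
}

// hasValidOwner reports whether at least one membership is an owner whose
// user still exists and which is not already being deleted.
func hasValidOwner(memberships []OwnerCandidate) bool {
	for _, m := range memberships {
		if m.Role == "owner" && m.UserFound && !m.Deleting {
			return true
		}
	}
	return false
}

func main() {
	memberships := []OwnerCandidate{
		{Role: "owner", UserFound: false}, // owner whose user was deleted
		{Role: "member", UserFound: true}, // valid user, but not an owner
	}
	fmt.Println("has valid owner:", hasValidOwner(memberships))
}
```

Note that with this definition an organization with zero memberships also counts as having no valid owner, so a freshly created organization whose owner membership has not been created yet would be flagged. That is one more reason to look at the grace period mentioned under Additional Considerations.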

Expected Flow After Fix

User Deleted
    ↓
OrganizationMembership controller detects user gone
    ↓
Deletes the membership (webhook allows this since user doesn't exist)
    ↓
Organization controller detects no valid owners
    ↓
Deletes the organization ✅

Additional Considerations

  1. Grace period: Consider adding a grace period before deleting memberships/orgs to handle temporary user states
  2. Audit logging: Log organization deletions triggered by this cleanup for audit purposes
  3. Notifications: Consider notifying other org members before deletion if the org has non-owner members

Labels

  • bug
  • resourcemanager
  • iam
