[Feature]: Support for non-NVIDIA backends (e.g., Iluvatar CoreX)

### 🚀 The feature, motivation and pitch

## Background
I am very interested in the Unified Cache Management (UCM) project, which provides an excellent way to persist and reuse the KV cache to speed up LLM inference. As the hardware landscape diversifies, adapting this cache management logic to Chinese domestic GPUs (such as Iluvatar CoreX / 天数智芯) is becoming increasingly important.

## Question
I would like to know if there are any plans or architectural considerations for supporting non-NVIDIA backends. Specifically:

1. **Hardware Abstraction:** Does the current implementation of UCM heavily rely on NVIDIA-specific features (e.g., CUDA VMM API, specific NVLink behaviors)?
2. **Framework Dependency:** Does UCM require a specific version of vLLM or other engines that are strictly tied to CUDA?
3. **Porting Effort:** In your opinion, what are the most critical modules that need to be rewritten or abstracted to support the Iluvatar CoreX software stack (which uses the DeepLink/CoreX SDK)?

I have access to Iluvatar CoreX hardware and would love to hear your thoughts on the feasibility of this adaptation.


### Alternatives

_No response_

### Additional context

_No response_

[Feature]:Support for non-NVIDIA backends (e.g., Iluvatar CoreX ) #667
