Secure Inference Serving Stack

End-to-end serving stack for privacy-preserving machine learning inference.

This project investigates how to operationalize privacy-preserving inference in a production-like serving environment, combining runtime scheduling and operator-level optimization under practical deployment constraints.

Key contributions include:

  • latency-aware orchestration for encrypted operators (see the scheduling sketch below)
  • service-level performance profiling tools (see the profiling sketch below)
  • reproducible deployment scripts for benchmark workloads
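
To make the first contribution concrete: one common approach to latency-aware orchestration is to order pending operator requests by deadline slack, using profiled per-operator latency estimates. The sketch below illustrates that idea only; the class and operator names (`LatencyAwareScheduler`, `LATENCY_ESTIMATES`, `enc_matmul`, etc.) and the latency values are hypothetical and not taken from this repository.

```python
import heapq
import time
from dataclasses import dataclass, field

# Hypothetical per-operator latency estimates (seconds), e.g. obtained from
# offline profiling of encrypted kernels. Values are purely illustrative.
LATENCY_ESTIMATES = {
    "enc_matmul": 0.120,
    "enc_relu": 0.015,
    "enc_conv2d": 0.250,
}

@dataclass(order=True)
class PendingOp:
    slack: float                          # deadline minus estimated finish time
    op_name: str = field(compare=False)
    deadline: float = field(compare=False)
    payload: bytes = field(compare=False)

class LatencyAwareScheduler:
    """Orders encrypted-operator requests by deadline slack, least slack first."""

    def __init__(self, estimates):
        self.estimates = estimates
        self.queue = []

    def submit(self, op_name, deadline, payload):
        # Slack = time remaining after accounting for the estimated runtime.
        est = self.estimates.get(op_name, 0.1)  # fallback for unknown operators
        slack = deadline - (time.monotonic() + est)
        heapq.heappush(self.queue, PendingOp(slack, op_name, deadline, payload))

    def next_op(self):
        # Pop the request closest to missing its deadline.
        return heapq.heappop(self.queue) if self.queue else None
```

Least-slack-first is just one plausible policy; the project's actual orchestration logic may weigh other factors such as batching opportunities or key/context reuse across encrypted operators.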
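
Similarly, service-level profiling typically means attributing end-to-end latency to named pipeline stages. Below is a minimal sketch of such a profiler, assuming a stage-scoped timer interface; `StageProfiler` and its methods are hypothetical names, not this project's API.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class StageProfiler:
    """Records wall-clock latency per named stage for later aggregation."""

    def __init__(self):
        self.samples = defaultdict(list)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.samples[name].append(time.perf_counter() - start)

    def summary(self):
        # Report count, mean, and p95 latency (milliseconds) per stage.
        return {
            name: {
                "count": len(xs),
                "mean_ms": 1000 * sum(xs) / len(xs),
                "p95_ms": 1000 * sorted(xs)[int(0.95 * (len(xs) - 1))],
            }
            for name, xs in self.samples.items()
        }

profiler = StageProfiler()
with profiler.stage("enc_matmul"):
    time.sleep(0.01)  # stand-in for an encrypted matmul call
print(profiler.summary())
```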