Machine Learning & Artificial Intelligence
This category provides architecture guidance for building and operating AI workloads on Open Telekom Cloud, with a focus on platform engineering for MLOps and AIOps. The articles are aimed at architects and platform teams who design shared cloud platforms that support machine learning and AI-driven operations in a structured, repeatable way.
Here, you'll find articles that dive into how AI and LLM workloads fit into a cloud platform, rather than how individual models are built. We look at architectural patterns for providing GPU-backed compute, scalable data access, and standardized environments for training, deployment, and inference. MLOps is treated as a platform concern, covering topics such as reproducible pipelines, environment consistency, and controlled model promotion across stages.
In addition, the category includes guidance on applying AIOps concepts to the operation of AI platforms and the underlying cloud infrastructure. This includes using telemetry and automation to improve observability, capacity management, and incident response in complex, Kubernetes-based environments.
Overall, these articles are intended to help platform engineering teams design AI-capable cloud architectures that are maintainable, scalable, and aligned with existing cloud and DevOps operating models on Open Telekom Cloud.