
Infrastructure Architect

  • Posted on: Feb 27, 2026
  • microTECH Global LTD
  • Edinburgh, Scotland
  • Salary: Not Available
  • Full-time

Job Title: Infrastructure Architect

Job Type: Full-time

Job Location: Edinburgh, Scotland

Remote: No

Job Description:

Job Title: AI Infrastructure Architect
Location: Edinburgh, Scotland
Type: Permanent. On-site working required; no sponsorship provided.

Responsibilities:

  • Design a unified AI infrastructure and serving platform for composite AI workloads such as LLM training and inference, RLHF, agents, and multimodal processing. The platform will integrate inference, orchestration, and state management, and will define the technical evolution path for Serverless AI + Agentic Serving.
  • Design a heterogeneous execution framework across CPU/GPU/NPU for agent memory, tool invocation, and long-running multi-turn conversations and tasks.
  • Build an efficient memory, KV-cache, vector store, logging, and state-management subsystem to support agent retrieval, planning, and persistent memory.
  • Build a high-performance runtime/framework that defines the next-generation Serverless AI foundation through elastic scaling, cold-start optimization, batch processing, function-based inference, request orchestration, dynamic decoupled deployment, and other features, supporting performance scenarios such as multiple models, multi-tenancy, and high concurrency.

Key Requirements:

  • Strong foundational knowledge of system architecture, computer architecture, operating systems, and runtime environments.
  • Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling.
  • Familiarity with inference and serving frameworks (vLLM, SGLang, Ray Serve, etc.); understanding of common optimization concepts such as continuous batching, KV-cache reuse, parallelism, and compression/quantization/distillation.
  • Proficiency with profiling/tracing tools; experience analyzing and optimizing system-level bottlenecks in GPU utilization, memory/bandwidth, interconnect fabric, and network/storage paths.
  • Proficiency in at least one system-level language (e.g., C/C++, Go, Rust) and one scripting language (e.g., Python).

If you're interested in applying, please reach out to

Position Details

Posted: Feb 27, 2026
Reference Number: 19584_4373575064
Employment: Full-time
Salary: Not Available
City: Edinburgh
Job Origin: APPCAST_CPC
