Location: Karadzicova 14, Bratislava
InoCloud is a fast-growing company specializing in providing cost-effective and flexible AI training solutions for businesses and researchers. Our GPU-accelerated infrastructure empowers our customers to scale their AI capabilities without any vendor lock-in. By taking care of all the challenges of owning and operating on-site hardware, we enable our clients to focus on what truly matters – advancing their AI projects.
We are seeking a talented and experienced AI Solution Engineer to join our team, primarily focusing on ML Ops, Data Center DevOps, and customer support. In this role, you will play a key part in automating processes, optimizing hardware maintenance, and providing top-notch support to our customers. If you have a passion for AI, cutting-edge technology, and a knack for helping customers succeed, this is the perfect opportunity for you.
- Design, develop, and maintain ML Ops and Data Center DevOps processes to improve efficiency, reliability, and scalability of InoCloud’s infrastructure.
- Collaborate with the development team to create and implement automated processes for deployment, monitoring, and maintenance of AI training infrastructure.
- Ensure the continuous availability and performance of GPU accelerators to meet customer demands and service level agreements.
- Provide expert technical support to customers, addressing their questions and resolving issues in a timely and professional manner.
- Collaborate with cross-functional teams to identify and implement hardware maintenance and optimization strategies.
- Stay current with industry trends and best practices in ML Ops and Data Center DevOps to continuously improve InoCloud’s infrastructure and offerings.
- Create and maintain clear documentation of processes, procedures, and system configurations.
- Demonstrated ability to create infrastructure, tooling, and end-to-end ML systems that facilitate rapid turnarounds for ML research teams.
- 3+ years of experience in Data Center/Cloud DevOps or a similar role.
- Strong knowledge of AI/ML technologies and GPU-accelerated infrastructure.
- Proficiency in scripting and programming languages such as Python, Bash, or Go.
- Experience with containerization and orchestration technologies, such as Docker and Kubernetes.
- Familiarity with cloud platforms, such as AWS, Azure, or GCP.
- Understanding of networking principles, including routing and reverse proxy configurations.
- Proficiency with configuration management tools and with version control systems such as Git.
- Excellent problem-solving skills and a customer-focused mindset.
- Strong written and verbal communication skills.
- Ability to work independently and as part of a team in a fast-paced, dynamic environment.
- Fluency in spoken and written English.
- Competitive salary and benefits package.
- Opportunity to work with cutting-edge technology and an innovative team.
- Professional growth and development opportunities.
- Flexible working hours and remote work options.
- A collaborative and supportive company culture.