Lead MLOps Engineer
ref nr: 115/12/2024/AJ/89273
At Antal, we have been working in recruitment for over 20 years. Because we operate in 10 specialised divisions, we have an excellent understanding of current industry trends. We precisely determine the specific nature of each job, identifying the key skills and necessary qualifications. Our mission is not only to find a candidate whose competences fit the requirements of a given job advertisement, but first and foremost to find a position that meets the candidate's expectations. Employment agency registration number: 496.
Technology stack
• GCP (must have!), BigQuery, Cloud Storage, Apache Airflow, Cloud Composer, Vertex AI, Dataproc, Compute Engine
• CI/CD and Build tooling: Terraform, Terragrunt, Jenkins, Groovy, Crane, Kaniko
• Python, PySpark, Docker, Jupyter, Apache Airflow, Spark, Java (optional, but would be beneficial)
Key Responsibilities
• Establish and maintain best practices for MLOps, including version control, CI/CD pipelines, and the Vertex AI Model Registry and Endpoints.
• Implement MLOps tools to streamline model development, training, tuning, deployment, monitoring and explainability.
• Deploy and manage ML models on GCP's Vertex AI platform, ensuring efficient and scalable execution.
• Identify and address performance bottlenecks in ML models and pipelines.
• Troubleshoot and resolve ML issues, ensuring optimal model performance and costs.
• Work closely with Compliance Analytics data scientists to prepare and preprocess data for model training and evaluation.
• Assist in feature engineering and selection to ensure model performance.
• Develop techniques to visualize and explain model behavior, ensuring model transparency and accountability in line with PRA SS1/23 guidelines.
• Collaborate with infrastructure and DevOps teams to establish efficient deployment and scaling strategies.
Pipeline Development:
• Build and maintain robust pipelines for model training, tuning and deployment, leveraging components of Vertex AI and GCP tooling such as Cloud Composer, using Python, Java and BigQuery.
• Implement automated monitoring and alerting to track model performance and identify potential issues.
• Develop and maintain data quality checks and validation, including reconciliations, in line with Data Quality and Retention Controls.
• Implement robust security measures to protect sensitive data and models.
Required Skills and Experience:
• Strong proficiency in MLOps principles and tools.
• Proficiency in data engineering and pipeline development.
• Experience with GCP, including BigQuery, Cloud Composer and Vertex AI.
• Strong problem-solving and analytical skills.
• Strong proficiency in Python.
• Experience with Java would be beneficial.