Research IT (HPC) Specialist
Job Purpose:
This position is responsible for managing the research computing back-end infrastructure and the related software stack used to administer and operate the High Performance Computing (HPC) cluster and related computing facilities.
Operational Responsibilities:
- Supervise the operations of HPC resources and conduct routine HPC production operations activities
- Develop and conduct awareness and training workshops on the HPC cluster and related components for students, faculty and staff, and provide regular one-on-one training sessions on how to use the system, including assistance with batch scheduling scripts
- Manage system users and provide end-user support on how to use the resources, including assistance with parallelization, debugging and optimization issues in their applications
- Keep end-users informed about resource issues, system changes, and any other updates
- Liaise with research teams to discuss issues and implement service improvements
- Monitor the performance of the various systems used and their maintenance status
- Conduct weekly checks, and apply updates when necessary, to maximize system up-time
- Keep track of software licenses and their renewal to ensure continuous availability
- Contact vendors for support issues and ensure their resolution
Location:
The campus is located in Masdar City, a thriving destination where students, visitors and residents can live, work and enjoy a range of recreational activities.
The MBZUAI campus provides a new model for sustainable living and working. Its purpose-built facilities include student accommodation, laboratories, a knowledge center, an auditorium, a multipurpose hall, a gym, a canteen, cafes, and retail outlets.