ExCL March Meeting 2025

3 minute read

March 2025 ExCL meeting slides.

It appears that you don't have a PDF plugin for this browser. You can click here to download the PDF file.


Below is a LLM-generated summary of the meeting based on the presentation slides.


Monthly Meeting Highlights - Experimental Computing Laboratory (ExCL) - March 2025

The March 2025 meeting of the Experimental Computing Laboratory (ExCL) brought together key insights and updates on various aspects of the laboratory’s infrastructure, upcoming plans, and enhancements to resources and documentation. Here’s a summary of the key topics discussed:

1. Infrastructure/System Status

The infrastructure is in good shape with no major outages planned. However, there are a few notable points:

  • Arista Networking: The current networking infrastructure is now out of support. Replacement quotes are being gathered.
  • TRC Availability: While the Translational Research Capability (TRC) is available, there will be some things to take care of like networking before moving in.
  • Mi300a Issues: A replacement part was misplaced, and work is underway to resolve this.
  • CentOS7 Systems: The last CentOS7 infrastructure system is still in operation, but upgrades are being processed.

2. Plans for the Next Three Months

Several exciting updates are planned for the next quarter, including:

  • Continued documentation and capability updates.
  • Deployment of the ARC Pro A60 card and AMD Xilinx U280.
  • Moving the rack to D102, pending infrastructure work (e.g., power distribution and networking improvements).

3. Power Consumption Monitoring

The current power consumption monitoring capabilities are limited. Some highlights:

  • PDU Monitoring: Available but limited in scope.
  • Host Instrumentation: Generally not great, with occasional power data available from BMC for power supplies.
  • GPU Instrumentation: For Nvidia and AMD, GPU monitoring is generally excellent, though work is required to integrate AMD glue code into Check_MK.

4. Meeting Notifications and Groups.io

ExCL User Group meetings are announced through their Groups.io mailing list. You can easily add these meetings to your calendar through the .ics file attachment in the notification email.

5. Reminder: SSH Keys for Authentication

ExCL encourages using SSH keys for authentication, citing advantages over passwords:

  • SSH keys provide better security and reduce the risk of account lockout after multiple failed login attempts.
  • Visual Studio Code and GitHub Docs provide guides to help set up SSH keys for remote development.

6. Documentation Updates

Several important updates were made to the ExCL User Documentation:

  • Jupyter Notebook and Julia quick-start guides have been updated.
  • New guides were introduced for Marimo, Modules, Backup & Storage, and ExCL Remote Development.

7. Conda Deprecation and Suggested Alternatives

ExCL has announced the deprecation of the Conda module and use at ORNL, with detailed instructions for transitioning to other tools such as Spack. Documentation on this change is available for users to help navigate the transition smoothly.

8. ExCL Project Storage

ExCL now offers shared storage for collaborative projects, managed through Access Control Lists (ACLs) to ensure security. Projects get a dedicated subvolume on the ZFS filesystem, and access is restricted to members of the project.

9. Open WebUI and Foxy Proxy Setup

The Open WebUI provides an interface for managing multiple ExCL tools, while Foxy Proxy is a useful tool for accessing services like ThinLinc, Jupyter, and Marimo. Documentation is available for setting up both tools.

10. Questions, Discussions, and Comments

Some open issues were discussed, including:

  • Apple M1 Availability.
  • An issue with M100 memory.
  • Links for additional information were shared on Creative Commons Licensing and further documentation.

The ExCL team continues to work on enhancing their infrastructure and documentation to better support users. The upcoming months promise exciting new deployments and important updates. Make sure to stay informed through the ExCL mailing list, monthly meetings, and their documentation updates!