ExCL April Meeting 2025
April 2025 ExCL meeting slides.
Below is a LLM-generated summary of the meeting based on the presentation slides.
Monthly Meeting Highlights - Experimental Computing Laboratory (ExCL) - April 2025
The ExCL April 2025 meeting showcased progress on infrastructure, upcoming projects, and a range of updates to ExCL’s resources. Here’s a breakdown of the key points discussed:
1. Infrastructure/System Status
Several key updates were shared regarding the current infrastructure:
- news Command: A new shell command (
news
) has been deployed, providing up-to-date information on system statuses. This will allow users to access real-time updates on ExCL’s infrastructure. - Mi300a: Deployment is still experiencing delays due to parts availability, but a new motherboard has been installed. Networking issues are being addressed by CTG Federal.
- Automated Account Reporting: Significant work has been done to automate account reporting and management, reducing manual oversight and improving efficiency.
- Vulnerability Management: Gateway machines will now auto-update every Thursday night at 10 PM, with automatic reboots if necessary, ensuring systems remain secure.
2. news Command Output
A snapshot of the news
command provided the latest updates:
- Mi300a Parts Delay: Parts for the Mi300a deployment are still pending.
- Equinox GPU Error: An error with the Equinox GPUs was reported. Reseating the NVLink cabling is planned to resolve this, with the fix scheduled for the week of April 21.
3. Python/Marimo Automated Account Database Linting and Reporting
ExCL introduced a new automated account database linting and reporting system built on Python and Marimo. This system enhances the accuracy of account data and simplifies reporting tasks, making the process more efficient and less prone to human error.
4. Plans for the Next Three Months
ExCL outlined its plans for the upcoming quarter:
- Continued documentation and capability updates.
- Deployment of the ARC Pro A60 card and AMD Xilinx U280.
- Ongoing work on the deployment backlog, focusing on lower-priority devices.
- Preparation for the Translational Research Capability move from 5100/227 to 3700/A102. This includes infrastructure work like moving racks to D102, enhancing power distribution, and addressing networking needs.
5. Meeting Notifications and Groups.io
ExCL keeps users informed about upcoming meetings via their Groups.io mailing list. Meeting notifications include an .ics
calendar file to easily add events to your calendar. For Outlook users, the calendar invite can be accessed directly from the email preview.
6. Documentation Updates — Remote Development Roadmap
The Remote Development guide has been updated to reflect best practices and new workflows for remote access to ExCL systems. This includes tools for development outside the ExCL environment:
7. Open WebUI
The Open WebUI provides a user-friendly interface for managing ExCL systems, making it easier for users to interact with tools and services. Updated documentation is available to guide users through its features:
8. ExCL Documentation Updates
Several updates have been made to the ExCL User Documentation:
- Jupyter Notebook and Julia guides have been refreshed.
- New documentation for Marimo, Modules, and ExCL Remote Development.
- A new section on Backup & Storage, including Project Storage, has been added, providing detailed instructions on managing and securing project data.
- Jupyter Guide
- Julia Guide
- Conda and Spack Installation Guide
- Marimo Guide
9. ExCL Project Storage
ExCL provides shared project storage for collaborative work, offering secure access to project subvolumes on the ZFS filesystem. Key features include:
- Access via automounted NFS share at
/auto/projects/<project_name>
. - Strict access control to ensure security, with Access Control Lists (ACLs) managing permissions.
- Only project members have read, write, and execute access to their specific project directory.
- Project directories cannot be listed by unauthorized users.
10. Questions/Discussions/Comments
During the meeting, some additional topics were discussed:
- Summer Intern Accounts: A reminder to use the
Interns2025
project ID for interns, along with any additional project identifiers. - Project/Shared Storage: More information was shared about shared storage options for collaborative projects.
The ExCL team continues to work on enhancing their infrastructure and documentation to better support users. The upcoming months promise exciting new deployments and important updates. Make sure to stay informed through the ExCL mailing list, monthly meetings, and their documentation updates!