Services
The Center for Advanced Research Computing (CARC) offers a variety of services to help our users elevate their research work and achieve unprecedented scientific breakthroughs.
Currently, CARC systems do not support the use or storage of sensitive data. CARC is in the process of building a secure computing environment. Until that environment is ready, if your research work includes sensitive data, including but not limited to HIPAA-, FERPA-, or CUI-regulated data, see our Secure Computing Compliance Overview or contact us at carc-support@usc.edu before using our systems.
For more information on the different types of sensitive data, see the USC IT Services Data Security page.
0.0.1 Computing
The Center for Advanced Research Computing (CARC) offers state-of-the-art high-performance computing clusters, cloud computing services, and more. We are constantly expanding our services and outreach to provide the USC research community with the tools they need to conduct transformative research.
0.0.1.1 High-Performance Computing
Our high-performance computing resources are the backbone of everything we do at the Center for Advanced Research Computing (CARC).
→ Discovery cluster
CARC launched its high-performance computing cluster, Discovery, in August 2020. The Discovery cluster marks a significant upgrade to CARC’s cyberinfrastructure and the first step in a major, user-focused overhaul of the program. This cluster includes additional compute nodes and a rebuilt software stack, as well as new system configurations to better serve CARC users. Discovery consists of 2 shared login nodes and a total of around 20,000 CPU cores in around 500 compute nodes. Of these, over 200 nodes are equipped with graphics processing units (GPUs) with a total of over 180 NVIDIA GPUs available. The typical compute node has dual 8- to 16-core processors and resides on a 200 gigabits-per-second (Gbps) NDR InfiniBand backbone.
Discovery includes an array of scientific software packages, both licensed and open source, for engineering, molecular simulation, and computational chemistry. Researchers can also install software packages or develop their own code within their project’s allotted storage.
Discovery is free to use for all USC faculty, research staff, and graduate students (with the approval of their faculty advisor). For detailed information on Discovery’s computing resources, see the Discovery Resource Overview.
→ Endeavour condo cluster
To provide comprehensive support to the USC research community, CARC built the Endeavour condo cluster, which gives researchers a way to customize their high-performance computing experience.
The Condo Cluster Program (CCP) was launched in December 2020 to provide service to USC researchers who require dedicated resources for their work. Compute nodes leased through the CCP form CARC’s Endeavour condo cluster. The CCP gives researchers the convenience of having their own dedicated compute nodes, without the responsibility of purchasing and maintaining the nodes themselves. The CCP operates on two different models, an annual subscription model and a traditional system purchase model, to provide researchers with flexible and efficient options for their resources. All hardware is purchased and maintained by CARC throughout the course of the lease or subscription term.
For more information on the CCP, including details on the two purchase models and pricing, see the Condo Cluster Program pages.
→ CARC OnDemand
CARC’s OnDemand service provides users with web access to the Discovery and Endeavour HPC clusters, including file storage systems. OnDemand offers:
- Easy file management
- Command line shell access
- Slurm job management
- Access to interactive applications, including Jupyter notebooks and RStudio Server
OnDemand is available to all users. For more information on how to use this service, see the CARC OnDemand pages.
→ Data transfer nodes
CARC has two dedicated, high-speed, 100 Gbps data transfer nodes available that are especially useful for large transfers. The Discovery and Endeavour login nodes have a 40 Gbps connection speed, which is adequate for most transfers.
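To get a feel for what these link speeds mean in practice, the short Python sketch below estimates a best-case transfer time from the nominal rates quoted above. It is an illustration only: the function name and the `efficiency` parameter are assumptions introduced here, and real transfers are usually slower because of protocol overhead, file system speed, and the network path to the remote site.

```python
def transfer_time_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Best-case hours needed to move size_tb terabytes over a link_gbps link.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for the
    fraction of the nominal link rate that a transfer actually achieves.
    """
    if size_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link rate > 0, and 0 < efficiency <= 1")
    bits = size_tb * 1e12 * 8                         # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9 * efficiency)   # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    # Nominal rates from the text: 100 Gbps transfer nodes, 40 Gbps login nodes.
    for label, gbps in (("data transfer node", 100), ("login node", 40)):
        print(f"{label}: about {transfer_time_hours(10, gbps):.2f} h for 10 TB at line rate")
```

At line rate the difference for a few gigabytes is negligible, which is consistent with the login nodes being adequate for most transfers; the gap only becomes noticeable for multi-terabyte data sets.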
For more information on CARC’s data transfer services, see the Data Management pages.
→ Software stack
CARC offers a comprehensive software stack on both the Discovery cluster and the Endeavour condo cluster for our users. The software stack allows users to find and load software using the Lmod module system.
Some of the available software includes, but is not limited to, Apptainer, MATLAB, Mathematica, and COMSOL. For more information on CARC’s software stack, see the Software pages.
→ Operating system
CARC uses a customized distribution of Rocky Linux 8, based on Red Hat Enterprise Linux (RHEL) 8. Rocky 8 is a high-quality Linux distribution that gives CARC complete control of its open-source software packages and is fully customized to suit advanced research computing needs, without the need for license fees. It was created to fill the role that CentOS previously played in research computing environments.
Official Rocky Linux documentation can be found here.
0.0.1.2 Cloud Computing
The Center for Advanced Research Computing (CARC) offers a private, on-premises cloud platform for USC researchers. Additionally, we provide consulting to help faculty and campus researchers obtain access to and effectively use cloud computing resources through our relationship with Amazon Web Services (AWS) as a preferred vendor.
CARC can help:
- Determine if the cloud is a good fit to meet your research requirements.
- Plan for and manage costs of custom cloud solutions.
- Gain access to cloud computing resources, both for research and research-related instruction.
→ Artemis private cloud computing
CARC launched Artemis in August 2023 as a cost-effective and comprehensive solution for cloud computing at USC.
Artemis is CARC’s private, on-premises cloud computing platform. Artemis complements existing CARC systems and services (Discovery and Endeavour clusters, file systems, etc.) by offering researchers access to virtual machines (VMs) on which they can run alternative operating system environments and deploy resources. Built on OpenNebula, Artemis provides a variety of VMs and microVMs for our users.
The development of Artemis was made possible through an award received from the 2020 NSF grant “CC* Compute: A Customizable, Reproducible, and Secure Cloud Infrastructure as a Service for Scientific Research in Southern California” (NSF award # 2019220).
For more information, see the Artemis user guides.
→ Amazon Web Services
Amazon Web Services (AWS) is USC’s preferred and recommended cloud provider. AWS provides a broad set of infrastructure services, such as computing power, storage options, networking, and databases, all delivered as a utility: on-demand, available in minutes, with pay-as-you-go pricing. All of these resources are maintained in secure data centers in multiple geographic locations.
CARC works closely with public cloud vendors to provide research teams with cost-effective cloud access while meeting federal compliance regulations (e.g., HIPAA, FERPA).
To get started, see our AWS Account Setup page.
0.0.1.3 Life Sciences Computing
The Center for Advanced Research Computing (CARC) provides access to life sciences resources, such as reference genomes and protein and nucleotide sequence databases.
In 2021, CARC completed a collaboration between Amgen, Dornsife, and ITS to establish access to two cryogenic electron microscopy (cryo-EM) instruments for USC researchers. The microscopes utilize a comprehensive data management and computational processing platform developed by ITS and CARC.
→ Cryo-EM microscopes
USC has two cryo-EM microscopes available for use: Krios with the K3 direct detection camera and Glacios with the Falcon 4 direct detection camera, both manufactured by Thermo Fisher. Both instruments are housed in the Michelson Center for Convergent Bioscience building in the Core Center for Excellence in Nano Imaging (CNI) at USC’s University Park campus.
The comprehensive computational environment for cryo-EM data processing includes:
- Automation of data extraction and transfer to CARC storage.
- Automation of data delivery to Amgen’s cloud storage.
- GPU cluster system deployment.
- Development of cryo-EM data pre-processing platform using Pegasus Workflow Management System.
- Development of cryo-EM user portal with integrated Slack user notification feature.
The high degree of automation during the data processing, extraction, and transfer process is a major benefit for researchers making use of the microscopes, and it is not typically available in cryo-EM workflows. In particular, the integration of Slack as a means to view pre-processed images from the microscopes in near real time is valuable for monitoring purposes.
Detailed information on creating a cryo-EM project and using the microscopes can be found in the Cryo-EM user guide.
→ Databases and other resources
Life sciences resources are available on both Discovery and Endeavour clusters. Users can access them either by copying the desired path from the Bio Resources user guide or by going to /project2/biodb/resourcename and choosing the desired resource. Listed below are some of the popular resources we offer. If you need a specific resource that is not currently listed, please submit a help ticket and we will try to make those resources available to you.
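For users who prefer to browse from a script, the following Python sketch lists what is available under the bio resources directory. It assumes that each resource is a subdirectory directly under /project2/biodb, which is how the path pattern above reads; the function name is hypothetical, and the default path only exists on CARC systems.

```python
from pathlib import Path

# Root taken from the path pattern in the text; present only on CARC systems.
BIODB_ROOT = Path("/project2/biodb")


def list_resources(root: Path = BIODB_ROOT) -> list[str]:
    """Return the sorted names of the subdirectories directly under root.

    Assumption: each resource lives in its own subdirectory of root.
    Returns an empty list if root does not exist (e.g., off-cluster).
    """
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())


if __name__ == "__main__":
    for name in list_resources():
        print(BIODB_ROOT / name)
```

The printed paths can then be passed to analysis tools in place of downloading a private copy of the same data.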
0.0.1.3.1 Genomes
A set of ready-to-use reference sequences and annotations for commonly analyzed organisms, sourced from iGenomes.
0.0.1.3.2 GenBank
The GenBank sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations.
0.0.1.3.3 Genome Taxonomy Database (GTDB)
The Genome Taxonomy Database (GTDB) is an initiative to establish a standardized microbial taxonomy based on genome phylogeny.
0.0.1.3.4 Pfam database
The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).
0.0.1.3.5 TIGRFAMs
TIGRFAMs is a resource consisting of curated multiple sequence alignments, hidden Markov models (HMMs) for protein sequence classification, and associated information designed to support automated annotation of (mostly prokaryotic) proteins.
0.0.1.3.6 UniProt
The Universal Protein Resource (UniProt) is a collaboration between the European Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics, and the Protein Information Resource (PIR).
0.0.2 Data
The Center for Advanced Research Computing (CARC) provides a comprehensive file system structure and data management solutions for all our researchers’ needs.
→ File systems
Every user has access to three file systems through their CARC account: /home1, /project2, and /scratch1.
Each user receives nearly 100 GB of storage space in a ZFS/NFS file system to store important code and configuration files in their home directory. Additionally, Principal Investigators (PIs) receive a maximum of 10 TB to be distributed across their projects in the all-NVMe flash storage /project2 file system. A rapidly growing fleet of state-of-the-art multicore compute nodes is available for data-intensive research jobs, backed by a high-throughput scratch file system. A low-latency, high-bandwidth 200 gigabits-per-second (Gbps) InfiniBand network fabric facilitates intense research workloads on large data sets.
CARC clusters are shared resources. As a result, there are quotas on usage to help ensure fair access to all USC researchers as well as to maintain the performance of the file systems. For more information on storage quotas, allocations, and pricing, see the Project and Allocation Management pages.
→ Data management
CARC provides high-speed data transfer nodes and a variety of useful tools to achieve secure and efficient data transfer depending on whether the storage location is a personal computer or an external site (e.g., cloud storage). Different tools offer solutions for data sensitivity requirements and varying levels of user familiarity.
The three main methods of data transfer that CARC supports are command-line tools, graphical tools, and the Globus service.
Due to security risks, please be mindful of the type of information being transferred. Where possible, omit all information that may be considered confidential. For examples of confidential information that requires additional consideration, visit the ITS Sensitive and Confidential Information page.
→ Data preservation
The Center for Advanced Research Computing (CARC) protects users’ data via snapshots and backups on certain directories. For larger-scale data archiving, CARC offers its own Cold Storage System or can facilitate services between researchers and the USC Digital Repository.
0.0.2.0.1 Snapshots and backups
CARC keeps snapshots of the /home1, /project, and /project2 directories for two weeks. If some files in these directories are deleted and they were captured in the snapshot, they can be recovered. Additionally, the /home1 directory is backed up daily on our on-site system.
The snapshots are only a partial backup. If data in /home1, /project, and /project2 is created and deleted between snapshots (i.e., within a one-day period), it will not be recoverable. Please keep additional backups of important data.
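The reasoning behind this limitation can be sketched in a few lines of Python. The two-week retention comes from the text above; the snapshot schedule (one snapshot per day at midnight) and the function names are assumptions made purely for illustration and may not match how CARC actually schedules snapshots.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(weeks=2)  # from the text: snapshots are kept for two weeks


def last_snapshot_before(moment: datetime) -> datetime:
    """Latest snapshot taken strictly before `moment`.

    Assumption: exactly one snapshot per day, at 00:00 (hypothetical schedule).
    """
    just_before = moment - timedelta(microseconds=1)
    return just_before.replace(hour=0, minute=0, second=0, microsecond=0)


def is_recoverable(created: datetime, deleted: datetime, now: datetime) -> bool:
    """True if a still-retained snapshot was taken while the file existed."""
    snapshot = last_snapshot_before(deleted)
    captured = snapshot >= created           # file already existed at snapshot time
    retained = now - snapshot <= RETENTION   # snapshot has not aged out yet
    return captured and retained


if __name__ == "__main__":
    now = datetime(2024, 3, 5)
    # Created and deleted on the same day: no snapshot ever saw the file.
    print(is_recoverable(datetime(2024, 3, 1, 9), datetime(2024, 3, 1, 17), now))
    # Survived at least one midnight snapshot that is still retained.
    print(is_recoverable(datetime(2024, 3, 1, 9), datetime(2024, 3, 3, 10), now))
```

Only the most recent snapshot before deletion needs to be checked: if that one predates the file or has already expired, every earlier snapshot has too.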
If files need to be recovered, submit a help ticket and the CARC team will attempt to locate them.
0.0.2.0.2 Data archiving
Principal Investigators (PIs) can request an allocation to CARC’s Cold Storage System in the user portal. Cold storage is intended for long-term (e.g., more than 5 years) storage of large data sets (TB to PB scale). It is a fee-based service, currently priced at $20/TB/year.
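For budgeting purposes, the quoted rate translates into a simple estimate. The Python sketch below assumes linear billing with no minimums or proration and a rate that stays constant over the whole period; these are assumptions for illustration, and the function name is hypothetical. Actual charges should be confirmed with CARC.

```python
COLD_STORAGE_RATE_USD_PER_TB_YEAR = 20.0  # current rate quoted in the text


def cold_storage_cost(size_tb: float, years: float,
                      rate: float = COLD_STORAGE_RATE_USD_PER_TB_YEAR) -> float:
    """Estimated cost in USD of keeping size_tb terabytes for the given years.

    Assumes cost = size * years * rate, i.e., linear billing at a fixed rate.
    """
    if size_tb < 0 or years < 0 or rate < 0:
        raise ValueError("size, years, and rate must be non-negative")
    return size_tb * years * rate


if __name__ == "__main__":
    # Example: a 50 TB data set archived for 5 years.
    print(f"${cold_storage_cost(50, 5):,.2f}")
```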
For more information on how to use cold storage, see our Data Preservation user guides.
CARC’s Cold Storage System preserves one copy of the stored data in one location with no regularly performed data integrity checks. PIs interested in multiple copies of their data and integrity checks should use the USC Digital Repository for their data archiving needs instead. Please submit a help ticket and the CARC team will assist you in facilitating this service.
→ Data science and analysis
The Center for Advanced Research Computing (CARC) offers data science support for USC researchers through extensive guides, user consultations, and workshops.
There are a variety of ways to run data science scripts on CARC systems. Researchers have the option to submit job scripts by running Anaconda in batch mode, using an Apptainer or Docker container, or running an interactive JupyterLab session on CARC OnDemand.
Our Data Science user guides provide details for each method, as well as information on how to use Apptainer.
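As a rough illustration of the batch-mode option, the Python sketch below renders a minimal Slurm job script as text. The `#SBATCH` options used are standard Slurm options, and `module purge` and `module load` are standard Lmod commands, but the module name `conda`, the script name `analysis.py`, the resource values, and the function name are all hypothetical placeholders; consult the user guides for the module names and settings that actually apply on Discovery and Endeavour.

```python
def render_batch_script(script: str,
                        cpus: int = 4,
                        mem_gb: int = 16,
                        time_limit: str = "01:00:00",
                        modules: tuple[str, ...] = ("conda",)) -> str:
    """Return the text of a single-node Slurm batch script that runs `script`.

    The default module name "conda" is a placeholder, not a confirmed
    CARC module name.
    """
    lines = [
        "#!/bin/bash",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
        "",
        "module purge",                             # start from a clean environment
    ]
    lines += [f"module load {name}" for name in modules]
    lines.append(f"python {script}")                # run the analysis non-interactively
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_batch_script("analysis.py", cpus=8, mem_gb=32), end="")
```

The generated text can be saved to a file (for example, job.slurm) and submitted with `sbatch job.slurm`, or the same directives can simply be written by hand.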
CARC can facilitate the use of several popular packages for data science applications and data analysis. More details on each of the packages listed can be found on our Popular Data Science Packages page.
Check our workshops to see the list of classes available and our current schedule.
For questions and more individualized support, submit a help ticket and one of our Research Facilitators will assist you.
0.0.3 Grant development support
The primary mission of the Center for Advanced Research Computing (CARC) will always be to support USC researchers in their work. To that end, CARC offers assistance to researchers who are seeking funding in the form of external grants from research centers and federal agencies, such as the National Science Foundation (NSF) and the National Institutes of Health (NIH).
In addition to providing computing resources, CARC can directly assist researchers with the grant proposal process. CARC will gladly provide letters of support for grant proposals from researchers who have utilized CARC resources in their research projects. To discuss letters of support, data management plans, facilities and resources documents, or hardware quotes for your grant, please contact us at carc-support@usc.edu.
If you are submitting a grant proposal for research that will utilize CARC resources, you will likely require some background information about CARC’s resources and facilities. You are welcome to use the following document in your proposal:
Link to Facilities and Resources document
0.0.4 Acknowledging CARC in your publications
All forms of publication, including web pages, resulting from work done using CARC resources should include the following citation:
The authors acknowledge the Center for Advanced Research Computing (CARC) at the University of Southern California for providing computing resources that have contributed to the research results reported within this publication. URL: https://carc.usc.edu.