Data Solutions

As the amount of data used in computational research continues to grow, so does the need for computing solutions that can manage large data sets. The Center for Advanced Research Computing (CARC) offers solutions for researchers who have more data than they can manage on a personal computer.

CARC's new high-performance computing cluster, Discovery, was designed with big data in mind. A low-latency, high-bandwidth InfiniBand network fabric supports intensive research workloads on large data sets. Discovery includes a rapidly growing fleet of state-of-the-art multicore compute nodes for data-intensive research jobs, backed by two 800 TB, high-throughput scratch file systems. The BeeGFS/ZFS parallel project file system provides 8.4 PB of usable space, with a default storage quota of 10 TB per Principal Investigator across their projects. Each user also receives nearly 100 GB of home directory space for storing important code and configuration files. For more information on CARC's file systems, see our Storage File Systems user guide.

For researchers who require more storage space, CARC can increase quotas on Discovery to accommodate larger data sets. More information on storage space and quotas can be found on our Accounts and Allocations page.
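
Before requesting a larger quota, it can help to estimate how much space a data set actually occupies. The following is a minimal sketch, not a CARC tool: it sums file sizes under a directory and compares the total with the 10 TB default project quota mentioned above. The function names are hypothetical, and treating 1 TB as 10^12 bytes is an assumption; the file system may account for space differently.

```python
import os

# Default project quota per Principal Investigator, as stated above.
# Assumption: decimal terabytes (1 TB = 10**12 bytes).
DEFAULT_PROJECT_QUOTA_BYTES = 10 * 10**12


def directory_size_bytes(root):
    """Return the total size in bytes of regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked data is not counted twice.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; leave it out of the estimate.
                continue
    return total


def fits_default_quota(size_bytes, quota_bytes=DEFAULT_PROJECT_QUOTA_BYTES):
    """Return True if size_bytes does not exceed quota_bytes."""
    return size_bytes <= quota_bytes


def format_terabytes(size_bytes):
    """Format a byte count as decimal terabytes with two decimal places."""
    return f"{size_bytes / 10**12:.2f} TB"
```

For example, calling `directory_size_bytes` on a local data directory and passing the result to `fits_default_quota` gives a rough indication of whether the default allocation is likely to be enough. The result is only an estimate of logical file sizes; compression or replication on the target file system could change the space actually consumed.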

CARC can help researchers develop an effective data management strategy for their projects, including the use of cloud computing solutions. For more information on our cloud computing services, see our Cloud Computing user guides.

Currently, CARC systems do not support the use or storage of sensitive data. If your research work includes sensitive data, including but not limited to HIPAA-, FERPA-, or CUI-regulated data, see our Secure Computing user guides or contact us at carc-support@usc.edu before using our systems.