Initial Preparations
- 1-day kickoff meeting at CDL to plan work and bring all actors up to speed
- SDSC educates CDL, IRC, and UCB staff (and UCLA and UCR) on lower-level networking challenges and issues (e.g. Inca), and on SDSC's lessons learned with LoC
- SDSC works with CDL and UCB to instrument and install network monitoring tools for all potential partner pathways
- CDL works with UCLA and UCR to (confirm and) stage large collection files for transfer testing:
  - 2 TB of Frontera collection audio files
  - 10 TB of California Digital Newspaper collection files
- CDL prepares local machines to receive collection files
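For scheduling purposes, the staged collection sizes above can be turned into rough transfer-time estimates. The following is a minimal sketch only: the function name, the link rates passed to it, and the 0.7 efficiency factor are illustrative assumptions, not measured values from any partner site.

```python
def transfer_hours(size_tb: float, rate_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate hours needed to move size_tb decimal terabytes over a link.

    rate_gbps is the nominal link rate in gigabits per second; efficiency is
    an assumed fraction of that rate actually achieved end to end.
    """
    if size_tb < 0 or rate_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, rate > 0, and 0 < efficiency <= 1")
    bits = size_tb * 1e12 * 8                      # decimal terabytes to bits
    effective_bits_per_second = rate_gbps * 1e9 * efficiency
    return bits / effective_bits_per_second / 3600


if __name__ == "__main__":
    # Collection sizes come from the plan; the link rates are hypothetical.
    for name, size_tb in [("Frontera audio", 2), ("California Digital Newspaper", 10)]:
        for rate_gbps in (0.1, 1.0, 10.0):
            hours = transfer_hours(size_tb, rate_gbps)
            print(f"{name}: {size_tb} TB at {rate_gbps} Gbps -> {hours:.1f} h")
```

Real efficiency depends on path tuning, the transfer tool, and disk speed at both ends, so the instrumented tests should replace the assumed factor with observed figures.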
Initial transfer tests
- Define appropriate transfer tools.
- Install, test and configure tools at all sites.
- Install appropriate instrumentation.
- Transfer varying amounts of data over varying amounts of time to verify procedures.
- Refine and document configurations on the wiki, noting experience gained installing and using the tools.
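Results from the varying-size test transfers could be recorded in a common form so that sites can compare them. The sketch below assumes each test is logged as a byte count and an elapsed time; the `TransferRun` record and its field names are hypothetical and not tied to any particular transfer tool.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TransferRun:
    label: str         # hypothetical test name, e.g. "UCLA-to-CDL, 50 GB"
    bytes_moved: int   # payload size actually transferred
    seconds: float     # wall-clock duration of the transfer


def throughput_mbps(run: TransferRun) -> float:
    """Effective throughput of one run in megabits per second."""
    if run.seconds <= 0:
        raise ValueError("duration must be positive")
    return run.bytes_moved * 8 / run.seconds / 1e6


def summarize(runs: list[TransferRun]) -> dict[str, float]:
    """Minimum, mean, and maximum throughput across a set of runs."""
    if not runs:
        raise ValueError("no runs to summarize")
    rates = [throughput_mbps(r) for r in runs]
    return {"min": min(rates), "mean": mean(rates), "max": max(rates)}
```

A wide gap between the minimum and maximum for the same pathway would be one signal that the configuration needs further tuning before the complete collection transfers begin.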
Complete collection data transfer
- Work with all parties to schedule transfers
- Begin complete instrumented transfer of Frontera collection
- Begin complete instrumented transfer of UCR collection
- Report via wiki on best networking practices, addressing:
  - baseline network capabilities needed for transfers
  - baseline monitoring tools for individual sites
  - configurations and tuned network setups at all sites
  - configurations and monitoring procedures
Study of remote replication
- Investigate questions on optimum number of replicas between organizations.
- Transfer tests from CDL to SDSC.
- Examine how auditing, fixity checking, and reporting would best function in shared environments.
- Examine impacts on performance (e.g. speed, reliability) of various storage configurations (e.g. NAS, DAS, clustered NAS).
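As one illustration of fixity checking across replicas, each site could build a checksum manifest of its copy and the manifests could then be compared. This is a minimal sketch, not a proposed design: the choice of SHA-256, the function names, and the report layout are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large collection files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root) -> dict[str, str]:
    """Map each file's path relative to root onto its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_manifests(source: dict[str, str], replica: dict[str, str]) -> dict[str, list[str]]:
    """Report files missing from, extra in, or differing in the replica."""
    return {
        "missing": sorted(set(source) - set(replica)),
        "extra": sorted(set(replica) - set(source)),
        "mismatched": sorted(
            name for name in set(source) & set(replica) if source[name] != replica[name]
        ),
    }
```

In a shared environment, each site could run `build_manifest` locally and exchange only the manifests, so an audit report would not require moving the data again.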
Reporting and Documentation
- Report and publish final recommendations, with broad goals including:
  - Create a shared environment in which the various sites are able to provide services to each other when needed.
  - Design appropriate replication storage strategies; test strategies using the above tools to verify synchronization and trustworthiness to ensure data integrity.
  - Define and study broader path-forward goals for the institutions based on the work in this project. Provide examples of future work that could be accomplished and demonstrations of how this work could benefit the larger NDIIPP community.
- (optional) Plan and organize a workshop bringing together stakeholders in big storage and fast transfer.