Research Data Storage, Sharing, and Transfer Options



Principal investigators should establish a research data management system for their projects including procedures for storing “working data” collected during the conduct of the research. The PI should communicate these procedures to all group members. The procedures should ensure that the PI is able to access all data produced by the research group and must meet all applicable security requirements. Below is a list of options for the storage, sharing, and transfer of digital research data. A glossary of the terms used in the summary is located at the end of this document.

| Provider | Service | Cost | Capacity | Archive | Working Data | Backed Up | Access Control | PHI/PII Certified Secure | Version Control | Shareability | Public Access |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CUIT | Courseworks (SAKAI) | $1/GB/Yr | 500 MB-5 GB | N | Y | Y | Y | Y | N | Y | Y |
| CUIT | Research Storage Service (RSS)+ | $1/GB/Yr | 100 GB-2 TB | N | Y | Y | Y | N | N | N | N |
| CUMCIT | CUMC IT Storage | $878/TB/Yr | Increments in TB to UNL | Y* | Y | Y | Y | Y | N | Y | N* |
| CUMCIT | CUMC IT FTP Server | $258/MB | 500 MB of storage up to 5 users* | N | Y | N | Y | Y* | N | Y | N |
| CUMCIT | SharePoint | $920/site/Yr | 10 GB/site | N | Y | Y | Y | Y | Y | Y | N |
| CUMCIT | Virtual Servers | $6,813/server/Yr | See below | N* | Y | Y | Y | Y* | N/A | Y | N |
| CDRS | Academic Commons | Free up to 10 GB/file (UNL # files), $5/GB/file after 10 GB/file | Up to 100 GB/file | Y | N | Y | N | N | N | Y | Y |
| ARCS | Data Storage | See below | See below | Y | Y | Y | Y | Y | N | Y | N |
| ARCS | High Performance Computing | See below | See below | N | N | N | Y | N | N | Y | N |
| ARCS | Virtual Servers | See below | See below | Y | Y | Y | Y | N | Y | Y | N |
| ARCS | Web Hosting | See below | See below | Y | Y | Y | Y | N | Y | Y | Y |
| ARCS | Colocation | See below | See below | N | N | N | Y | N | N | N | N |
| Commercial | Google Drive | Free up to 15 GB, or $10/user/Mo for UNL | Up to 15 GB or UNL* | N | Y | N | Y | N | N | Y | N |
| Commercial | Amazon Cloud Drive | Upload/download fees, 5 GB free, or (#GB/2 = $ per yr) | 5 GB-1000 GB | Y | Y | Y | Y | N | N | Y | N |
| Commercial | Figshare | Free up to 1 GB, see below for pricing options | Private space: 1 GB-20 GB, Public space: UNL | Y | Y | Y | Y | N | Y | Y | Y |
| Commercial | Dropbox | See pricing options below | 2 GB-UNL | Y | Y | Y | Y | N | N | Y | N |
| Other | PC/Mac Server (w/ options)* | Depends on cost of machine | As much space as willing to pay for | | | | | | | | |

CUIT, CUMCIT, CDRS, and ARCS are Columbia University affiliated.


+ : In pilot mode; contact CUIT for more information.
* : See description below.

** Note: Prices and availability may vary from those noted in the table above; see the websites below for the most up-to-date pricing and availability.
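One way to use the summary table is to filter it by the features a project requires. The following is a minimal sketch, assuming a hand-transcribed subset of the table rows; the `ROWS` encoding and the `services_with` helper are hypothetical names introduced for illustration, and the "*" caveats from the table are dropped, so the service descriptions should still be checked before choosing.

```python
# Feature columns in the same order as the summary table.
COLUMNS = [
    "archive", "working_data", "backed_up", "access_control",
    "phi_pii_certified", "version_control", "shareability", "public_access",
]

# Hand-transcribed subset of the table; "*" caveats are intentionally dropped.
ROWS = {
    "Courseworks (SAKAI)": "NYYYYNYY",
    "Research Storage Service (RSS)": "NYYYNNNN",
    "CUMC IT Storage": "YYYYYNYN",
    "SharePoint": "NYYYYYYN",
    "Academic Commons": "YNYNNNYY",
    "Google Drive": "NYNYNNYN",
    "Figshare": "YYYYNYYY",
    "Dropbox": "YYYYNNYN",
}


def services_with(*required):
    """Return the services whose table entry is 'Y' for every required column."""
    matches = []
    for name, flags in ROWS.items():
        row = dict(zip(COLUMNS, flags))
        if all(row[column] == "Y" for column in required):
            matches.append(name)
    return matches


if __name__ == "__main__":
    # Services listed as both backed up and PHI/PII certified.
    print(services_with("backed_up", "phi_pii_certified"))
```

Adding more rows from the table only requires extending `ROWS` with the same eight-letter encoding.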

CUIT

Courseworks (SAKAI): A collaboration environment for communicating, sharing documents, and collaborative editing, with organization tools such as calendar, membership, announcements, documents, and group work. Brings forward web 2.0 tools including blog, wiki, chat, and RSS to support group and team collaboration. Requires an active UNI and completion of a training course.

http://services.cuit.columbia.edu/sakai-collaboration-system

CUIT RSS: Both personal and research group space can be purchased. A researcher can purchase personal space and up to two group storage spaces. A valid UNI and password are required for access. Data is backed up in 30-day snapshots. Not certified for PHI and HIPAA materials; not currently certified for PII, but could be in the near future.

http://services.cuit.columbia.edu/research-storage-service-pilot

CUMCIT

CUMC IT Storage: HIPAA compliant. Storage includes a “drop box” solution. Serves as an archive solution as long as the client continues to fund the storage space. Public access is in the process of being implemented.

https://secure.cumc.columbia.edu/cumcit/secure/howto/remote/index.html

CUMC IT FTP Server: Ideal for transient storage, such as the transfer of large files. Access to the server is set up by CUMC IT and does not require a UNI or MC account. Intended for temporary storage. Currently under review for PHI/PII certification.

https://secure.cumc.columbia.edu/cumcit/secure/storage.html

SharePoint: CUMC IT offers SharePoint 2010 web sites for Medical Center groups and departments. A SharePoint site provides an intuitive area for online collaboration, including document, calendar, and list sharing, with only an approved MC Domain account required. These sites are managed within your group, allowing for granular levels of access based on your needs. Files cannot be recovered once deleted, and files remain on the site as long as the client continues to pay.

http://www.cumc.columbia.edu/it/getting_help/sharepoint.html

Virtual Servers: Ideal for running applications or programs. Infrastructure powered by VMware. Content remains on the server as long as the client continues to pay for services. Individual servers may become PHI/PII certified. CUMC IT sets up, installs, and regularly monitors servers. Regular backups are performed to a secure off-site location using Symantec NetBackup. Typical setups include the following, with additional storage and customization available:

Windows - 80GB of hard disk space, 4GB of RAM, on Windows Server 2008 R2
Linux/LAMP - Linux, Apache, MySQL, and PHP


CDRS

Academic Commons: Digital repository for Columbia University faculty, students, staff, and affiliates. Maintained by the Center for Digital Research and Scholarship at Columbia University Libraries. Any digital content can be uploaded and is freely available to the public. A URL is given to each uploaded document so that it is citable.

http://academiccommons.columbia.edu/  

http://support.academiccommons.columbia.edu/knowledgebase/articles/35783-can-i-deposit-my-data-in-academic-commons

ARCS

Advanced Research Computing Services (ARCS) website:

http://systemsbiology.columbia.edu/advanced-research-computing-services

Advanced Research Computing Services (ARCS) Data Storage: Storage services for a number of applications, ranging from desktop file storage to high-performance computing applications. A HIPAA-compliant Isilon clustered file system provides over 1 PB of high-speed, redundant storage for our compute clusters and user data. A secondary Isilon clustered file system provides daily replication of valuable data to a secondary site. A large, scalable tape robot and a pair of backup servers provide automated backups of all relevant storage to tape for long-term backup. See website below for more information:

http://systemsbiology.columbia.edu/data-storage

For pricing information: http://systemsbiology.columbia.edu/advanced-research-computing-services-pricing  

 

Advanced Research Computing Services (ARCS) Virtual Server Hosting: A robust server virtualization environment that is available for both Linux and Windows servers. Virtual servers can be configured for application, web, database, development, and other miscellaneous purposes. ARCS provides this as a hosted service; owners have full control of the content and administration of their virtual server. For pricing information:

http://systemsbiology.columbia.edu/advanced-research-computing-services-pricing

Advanced Research Computing Services (ARCS) High Performance Computing: ARCS offers a large Linux-based compute cluster featuring 6,336 CPU cores, GPU-enhanced computing, and two high-memory systems with 1 TB of RAM each. Pricing for CPU time is determined on a per-CPU-hour basis.

For pricing information: http://systemsbiology.columbia.edu/advanced-research-computing-services-pricing  

Advanced Research Computing Services (ARCS) Web Hosting: The web hosting services ARCS offers comprise three virtual web servers and a virtual web development server. The ARCS web infrastructure sits behind a load balancer to stabilize web traffic flow. If your website requires a database, ARCS also offers database hosting for an additional charge. ARCS provides this as a hosted service; website owners have control of their site content and applications. For pricing information: http://systemsbiology.columbia.edu/advanced-research-computing-services-pricing

Advanced Research Computing Services (ARCS) Colocation: The 3,000 sq. ft. Irving Cancer Research Center's Data Center provides a state-of-the-art server hosting environment, including fault-tolerant network configurations, a 10 Gbps Ethernet core, direct links to the Columbia University backbone, and advanced power and cooling. Equipment hosting comes in two tiers: high density and low density. High-density racks are equipped to power and dissipate 20 kW per rack and are intended for high-performance computing applications; low-density racks provide 5 kW of power and cooling per rack, and offer an ideal environment for traditional server hosting.

For pricing information: http://systemsbiology.columbia.edu/advanced-research-computing-services-pricing  

Commercial

Google Drive: Beginning August 1st, 2014, Columbia University will offer Google Drive only to users who have a LionMail account. Other users will need a Gmail account to set up Google Drive. Users can upload and store anything. Encrypted using SSL. Files are kept private until the user “invites” others to view selected files by entering valid email addresses for the shared users. It is not intended as a permanent archive space for data, and fees may be associated with the retrieval of old data files. Once a file is deleted, it cannot be recovered.

http://www.google.com/drive/index.html

Amazon Cloud Drive: Any digital content can be uploaded and accessed remotely. Amazon Cloud Drive is compatible with all Amazon devices. There are several templates available for customization, including version control. 5 GB of storage is free to all Amazon users. Pricing plans begin at 20 GB of storage for $10 a year. The annual cost in dollars is the number of GB divided by 2. For example, 100 GB of storage costs $50/year, 200 GB of storage costs $100/year, and so on.

https://www.amazon.com/clouddrive/learnmore#features-section
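The Amazon Cloud Drive pricing rule above (5 GB free, then dollars per year equal to GB divided by 2, starting at a 20 GB plan) can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name is hypothetical, the 1000 GB ceiling comes from the capacity column of the summary table, and treating any size between 20 GB and 1000 GB as a purchasable plan is an assumption, since only a few plan sizes are listed. Upload/download fees are not modeled.

```python
def amazon_cloud_drive_annual_cost(storage_gb):
    """Estimate the yearly storage cost in USD from the GB / 2 rule of thumb."""
    if storage_gb < 0 or storage_gb > 1000:
        raise ValueError("storage must be between 0 and 1000 GB")
    if storage_gb <= 5:
        # The first 5 GB are free for all Amazon users.
        return 0.0
    # Paid plans begin at 20 GB, so smaller needs are rounded up to that plan.
    plan_gb = max(storage_gb, 20)
    return plan_gb / 2


if __name__ == "__main__":
    for size in (5, 20, 100, 200):
        print(size, "GB ->", amazon_cloud_drive_annual_cost(size), "USD/year")
```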

Figshare: A cloud-based system for securely managing research data. Any type of data can be uploaded, with options for sharing with select people or making information publicly available, discoverable, and citable. Many file formats can be visualized within Figshare’s website without users requiring specific software. Data is backed up at multiple institutions around the world, DOIs are provided by DataCite, and content is hosted on Amazon Web Services, which provides virtually limitless file storage and fast upload and download times. Fulfills public access requirements for many funders and publishers.

Figshare Pricing Plan

| Plan | Free | $8/month | $11/month | $15/month |
|---|---|---|---|---|
| Private Storage | 1 GB | 10 GB | 15 GB | 20 GB |
| File Size Limit | 250 MB | 500 MB | 500 MB | 1 GB |
| Collaboration Spaces | 1 | 3 | 3 | 3 |
| Collaborators | 5 | 7 | 12 | 20 |
| Public Space | Unlimited | Unlimited | Unlimited | Unlimited |

For other pricing options, customers are encouraged to contact Figshare.

http://figshare.com/
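The Figshare pricing plan above can be turned into a simple lookup that picks the cheapest plan meeting a project's needs. The sketch below is an illustration only: the `Plan` type and `cheapest_plan` helper are hypothetical names, the 1 GB file size limit is treated as 1000 MB by assumption, and the figures are copied from the table above and may be out of date.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    monthly_usd: int
    private_storage_gb: int
    file_size_limit_mb: int
    collaborators: int


# Values copied from the Figshare pricing table, cheapest plan first.
FIGSHARE_PLANS = [
    Plan(0, 1, 250, 5),
    Plan(8, 10, 500, 7),
    Plan(11, 15, 500, 12),
    Plan(15, 20, 1000, 20),  # 1 GB file limit, taken as 1000 MB (assumption)
]


def cheapest_plan(private_gb, largest_file_mb, collaborators) -> Optional[Plan]:
    """Return the lowest-priced plan that satisfies all three requirements."""
    for plan in FIGSHARE_PLANS:
        if (plan.private_storage_gb >= private_gb
                and plan.file_size_limit_mb >= largest_file_mb
                and plan.collaborators >= collaborators):
            return plan
    # No listed plan fits; the document suggests contacting Figshare.
    return None


if __name__ == "__main__":
    print(cheapest_plan(private_gb=12, largest_file_mb=400, collaborators=6))
```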

Dropbox: Files can be uploaded to Dropbox, then edited from any location and shared. Everything is private until the user chooses to share with other parties. Files are secured with 256-bit AES encryption and two-step verification. Three different pricing plans are available. Users can acquire additional space through Dropbox's “refer a friend” benefits.

Basic- Free up to 2GB

Pro- $9.99/month up to 100 GB

Business- Additional administration features and version control for 5 or more users for $15/user/month for unlimited space.


Other

PC/Mac Server: PIs and researchers may choose to set up their own private server housed within their research laboratory. Users who choose this option need to contact CUIT to set up a static IP address for the machine. The PI is responsible for the maintenance of the server, performing back-ups, and access control. Special precautions are needed for PHI/PII and other sensitive information, including keeping the server in a locked facility and ensuring properly functioning firewalls. For large amounts of data (several TB), PIs can purchase network-attached storage (NAS) or other devices, which can stand alone or be connected to the server. PIs can also elect to run an FTP server; see below.

http://www.wikihow.com/Build-a-Fileserver

https://www.apple.com/mac-mini/server/
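Because the PI is responsible for back-ups on a self-managed server, a simple verified copy may be a useful starting point. The following is a minimal sketch, not a complete backup strategy: the function names and the backup directory are hypothetical, and it only copies a single file and confirms that the copy matches the original by comparing SHA-256 checksums.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy one file into backup_dir and verify the copy by checksum."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"checksum mismatch after copying {source}")
    return target
```

For sensitive data, the backup location needs the same physical and network protections as the server itself.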

FTP Server: Uses a client-server design. FTP connections are often secured using SSL/TLS. Many FTP options are available, including free services. PIs should exercise caution when choosing a server to transfer their research data, because FTP servers can be vulnerable to hacking.

http://www.slideshare.net/mwGSU11/choosing-an-ftp-client-8294642

http://www.mediacollege.com/internet/ftp/clients.html
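As an illustration of an FTP transfer secured with SSL/TLS, the sketch below uploads one file using Python's standard `ftplib.FTP_TLS` client. The function name, host, credentials, and file names are all placeholders (assumptions), and the target server must support explicit FTP over TLS for this to work.

```python
from ftplib import FTP_TLS


def upload_over_ftps(host, user, password, local_path, remote_name):
    """Upload a single file over FTP with TLS on both control and data channels."""
    with FTP_TLS(host, timeout=30) as ftps:
        # login() upgrades the control connection to TLS before sending credentials.
        ftps.login(user, password)
        # prot_p() switches the data connection to TLS as well.
        ftps.prot_p()
        with open(local_path, "rb") as handle:
            ftps.storbinary(f"STOR {remote_name}", handle)


# Example call with placeholder values (hypothetical host and account):
# upload_over_ftps("ftp.example.org", "pi_user", "secret", "results.csv", "results.csv")
```

Without the `prot_p()` call, only the login would be encrypted and the file contents would travel in clear text.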

Glossary

Access Control- PI has the ability to control who can view, alter, upload, and download content. Access is secured with user name and password.

Archive- All data can be saved for long-term (permanent) storage.

Backed-up- Data is regularly backed up automatically with the PI's involvement.

PHI/PII Certified- Certified by CUIT/CUMCIT as meeting the highest level of security possible for PHI and/or PII.

Public Access- Options to make data publicly available to fulfill funders' and/or publishers' requirements.

Shareability- Able to share certain data, as decided by PI, with collaborators from all over the world.

UNL- Unlimited

Version Control- Protocol set in place for versioning of data files being used by multiple users.

Working Data- Data produced that is in preparation for publications, grant submissions, presentations, etc. that has not been formally published.
