Size of Data Sets

How much information is in your hands.

Another fundamental question to consider when visualizing data is: "How large is the data set?" Or, put differently, "How much space does it occupy on disk?" The answer affects your computer's performance while you work with the data. Nowadays, the most sophisticated scanners, microscopes and cameras produce very high-definition (HD) images which consume a lot of memory. Opening a large 3D data set at 4K HD quality will likely use most of your machine's RAM, which degrades user interaction and lowers the frame rate (frames per second, fps).
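To answer the size question before opening anything, you can sum the file sizes on disk. Below is a minimal Python sketch using only the standard library; the example path is a placeholder, not a real data set.

```python
import os

def dataset_size_gb(path):
    """Return the total size of a file or directory tree in gigabytes."""
    if os.path.isfile(path):
        total = os.path.getsize(path)
    else:
        total = 0
        for root, _dirs, files in os.walk(path):
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
    return total / 1024**3

# Example (placeholder path): print the size of a scan directory
print(f"{dataset_size_gb('/data/scans/mouse_brain'):.2f} GB")
```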

How to Work with Data Sets over 4 GB

Nowadays, even a fairly average personal computer can have 16 gigabytes (GB) of RAM, and powerful gaming stations can have up to 64 GB. Although the amount of RAM keeps increasing over the years, the size of data does as well, at an even faster pace. Advanced acquisition devices produce high-definition files of over 4 GB each, and a whole data set can easily exceed 30 GB.

RAM is shared across all the applications on your machine, as well as with the operating system. For example, your browser can consume 3 GB, plus the music player, videos, and the applications you typically work with, leaving perhaps 5 to 8 GB of free RAM for other files. Opening a single 4 GB data set then takes almost half of your available memory. At this point your computer will start slowing down, some applications may close automatically, and you might regret opening the file.
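Before opening a multi-gigabyte file, it can be worth checking how much RAM is actually free. Here is a minimal sketch, assuming the third-party psutil package is installed; the file path and safety factor are illustrative only.

```python
import os
import psutil  # third-party: pip install psutil

def can_fit_in_ram(path, safety_factor=2.0):
    """Rough check: is there enough free RAM to open this file comfortably?

    safety_factor accounts for in-memory data often being larger
    than the file on disk (decompression, copies, etc.).
    """
    file_bytes = os.path.getsize(path)
    available = psutil.virtual_memory().available
    return file_bytes * safety_factor < available

# Example (placeholder path)
if not can_fit_in_ram("/data/scans/volume_4gb.tif"):
    print("Warning: this file may exhaust your available RAM.")
```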

Getting More RAM

One solution to this problem is simply buying more RAM. At the moment this article was written, the maximum RAM a standard desktop can have is 128 GB, and more advanced workstations can hold up to 2 TB. Adding more CPU cores and GPUs might also do the trick. However, the main obstacle to this solution is money.

Cons

RAM is expensive, and not all motherboards support large amounts of it, so a modern motherboard may have to be bought as well. GPUs are not cheap either, and at some point the price of a high-end computer with the latest hardware goes beyond any budget.

Please take all of these details into account when considering this option. If you need any help or more information about the right specifications, please contact the CCV Visualization group by sending an email to support@ccv.brown.edu.

Pros

A high-end desktop computer with over 1 TB of RAM is extremely useful for any type of job. Large and heavy operations will not depend on external factors such as network lag and resource allocation. Specialized software can produce results faster and export them to multiple formats, such as real-time video.

High Performance Computing (HPC) Cluster

Distributed computing architectures and parallel programming can help load large data sets at low cost. Many institutions (educational and private) offer cloud services where users can rent or request RAM, CPUs and GPUs on computers hosted in HPC facilities.

Using this kind of service reduces the requirements on your desktop computer, and all updates are handled on the server side. High-end hardware and software become available without downloading or installing anything on your local device.

The only requirement is an internet connection. The service may also be offered as a subscription (involving monthly fees) depending on the amount of resources you need.

Brown University offers HPC services for multiple tasks. For more information, please follow the link.

Divide and Conquer

Large data sets can be subdivided into multiple parts and distributed across the HPC cluster (multiple nodes), where specialized software synchronizes the split data and sends the resulting frames to an external device (the user's local machine). This solves the problem of visualizing large data sets, making them easier and cheaper to analyze.
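The same divide-and-conquer idea can be sketched on a single machine: instead of loading the whole volume, process it chunk by chunk and combine the partial results. Below is a minimal NumPy memory-mapping sketch; the path, shape, data type and the statistic computed are placeholders, and on a real cluster the chunks would be distributed across nodes rather than read sequentially.

```python
import numpy as np

# Placeholder parameters: in practice these come from the data set's metadata.
PATH = "/data/scans/large_volume.raw"
SHAPE = (2048, 2048, 2048)   # a volume far larger than typical RAM
DTYPE = np.uint16
CHUNK = 64                   # number of slices processed per step

# Memory-map the file so only the chunks we touch are read from disk.
volume = np.memmap(PATH, dtype=DTYPE, mode="r", shape=SHAPE)

partial_max = []
for start in range(0, SHAPE[0], CHUNK):
    chunk = volume[start:start + CHUNK]      # only this slab is loaded
    partial_max.append(chunk.max())          # per-chunk ("map") result

print("global maximum:", max(partial_max))   # combine ("reduce") the results
```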

Cons

The internet connection on the user's side must be reliable, and the more bandwidth the better. Studying large data sets can take time, and losing the connection in the middle of an interactive session could force you to start from zero (this is not true for batch jobs). Slow connections degrade the interaction with the data and frustrate any attempt to analyze it.
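One way to soften this risk is to checkpoint long analyses so that a dropped connection only costs the current step, not the whole run. Here is a minimal sketch; the checkpoint file name and the "work" inside the loop are placeholders for a real analysis.

```python
import json
import os

CHECKPOINT = "analysis_checkpoint.json"  # placeholder file name

def load_checkpoint():
    """Resume from the last saved step, or start from zero."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"next_step": 0, "results": []}

def save_checkpoint(state):
    with open(CHECKPOINT, "w") as f:
        json.dump(state, f)

state = load_checkpoint()
for step in range(state["next_step"], 100):  # 100 placeholder work units
    state["results"].append(step * step)     # stand-in for real analysis
    state["next_step"] = step + 1
    save_checkpoint(state)                   # a lost connection only loses the current step
```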

The quality of the server side also impacts the process. If multiple users request resources at the same time, some of them will have to wait until more resources are available. The more CPUs and GPUs you can access, the better; however, obtaining more computational power might involve additional fees. Additionally, servers are sometimes taken offline for maintenance and updates, and users will be affected by this downtime.
