It depends on how the data is structured. HDDs don't like random reads. If the data is stored in .tfrecord files, reads are mostly sequential and an HDD is a safe choice. On my 4x1080Ti workstation, ResNet-50 training in multi-GPU mode generates 70-100 MB/s of disk traffic. An HDD's maximum read speed is about 120 MB/s (WD RE3). If the data is scattered across individual JPEG files, it is better to buy an SSD. HDD ...
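
If you want to check this on your own machine, here is a rough sketch (not a proper benchmark) that times a front-to-back read of one large file and compares the result with what training needs. The function names `sequential_read_mb_s` and `keeps_up` are made up for illustration, and the 1 MiB block size is an assumption, not a tuned value.

```python
import time


def sequential_read_mb_s(path, block_size=1024 * 1024):
    """Read `path` front to back and return the observed throughput in MB/s.

    Assumption: 1 MiB blocks are large enough to look like a sequential
    streaming read. The OS page cache is NOT bypassed, so a second run on
    the same file can report a number far above what the disk can do.
    """
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering on top of the OS reads.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on tiny, fully cached files.
    return total_bytes / (1024 * 1024) / max(elapsed, 1e-9)


def keeps_up(disk_mb_s, required_mb_s, headroom=1.0):
    """Return True if the disk rate covers the required rate times `headroom`."""
    return disk_mb_s >= required_mb_s * headroom
```

For example, `keeps_up(sequential_read_mb_s("train.tfrecord"), 100)` would compare a measured rate against the upper end of the traffic quoted above (`train.tfrecord` is a placeholder path). Use a file larger than RAM or a freshly mounted disk, otherwise you are mostly measuring the cache. This only covers the sequential case; many small JPEG files will do much worse on an HDD than this number suggests.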
Source: Discussion on r/MachineLearning




