If your image stack consists of a time sequence of 2D images (2D space + time) and a 3D volume (3D space), then it is OK to keep them in separate files. If they are photos (3x8-bit RGB), then it is OK to store them as PNG.
What is odd is that it is normally not meaningful to visualize a 2D+t stack as a 3D image. What is your application domain (clinical imaging, microscopy, geo, etc.)? What kind of images do you work with, and how are they acquired?