Loading the 2D+t image sequence from DICOM should not be a problem. Most DICOM readers should be able to handle it. For example, I know that 3D Slicer (recent Slicer Preview Releases for sure) can read an XA image series into a time sequence that you can visualize, conveniently process using VTK filters, or access as a numpy array.
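If you want to try this outside Slicer, here is a minimal sketch using pydicom. That library is my assumption, not something required by the approach above. The function names `load_xa_sequence` and `as_time_sequence` are hypothetical. The sketch assumes the whole run is stored in a single multi-frame grayscale file, which is common for XA but not guaranteed.

```python
import numpy as np


def as_time_sequence(pixels):
    """Return pixel data as a (frames, rows, cols) array.

    A single-frame image is returned by the reader as (rows, cols),
    so it is promoted to a one-frame sequence.
    """
    pixels = np.asarray(pixels)
    if pixels.ndim == 2:
        pixels = pixels[np.newaxis, ...]
    if pixels.ndim != 3:
        raise ValueError(f"expected grayscale 2D or 2D+t data, got shape {pixels.shape}")
    return pixels


def load_xa_sequence(path):
    """Read one multi-frame XA DICOM file and return (frames, bits_stored)."""
    import pydicom  # imported lazily so the helper above works without it

    ds = pydicom.dcmread(path)
    frames = as_time_sequence(ds.pixel_array)
    # BitsStored tells you the real dynamic range (for example 10 or 12 bits).
    return frames, int(ds.BitsStored)
```

Checking `BitsStored` right after loading is a quick way to confirm that you received native data rather than an 8-bit secondary capture.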
I would not use PNG, MP4, or AVI file formats. They are generally limited to 8 bits, while you need 10 bits if you export native images and want to do the subtraction yourself. In some cases, compression artifacts might cause problems, too. DICOM export might be disabled by default in your PACS or image review workstation, but you can always request a DICOM export via disk or network.
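To illustrate why the bit depth matters for subtraction, here is a small synthetic sketch. The gray values are made up, and logarithmic subtraction is just one common choice that I am assuming here; plain subtraction would show the same effect. A faint 3-gray-level change is visible in the 10-bit data but disappears completely after reducing to 8 bits.

```python
import numpy as np


def log_subtract(mask, frame):
    """Logarithmic subtraction of a contrast frame from a mask frame."""
    # +1 avoids log(0); float64 avoids unsigned integer wrap-around.
    mask = mask.astype(np.float64) + 1.0
    frame = frame.astype(np.float64) + 1.0
    return np.log(mask) - np.log(frame)


# Synthetic 10-bit data: uniform background, faint "vessel" in column 2.
mask_10bit = np.full((4, 4), 603, dtype=np.uint16)
frame_10bit = mask_10bit.copy()
frame_10bit[:, 2] -= 3  # assumed: contrast lowers intensity by 3 gray levels

# Simulate an 8-bit export by dropping the two least significant bits.
mask_8bit = (mask_10bit >> 2).astype(np.uint8)
frame_8bit = (frame_10bit >> 2).astype(np.uint8)

dsa_10bit = log_subtract(mask_10bit, frame_10bit)
dsa_8bit = log_subtract(mask_8bit, frame_8bit)

print("10-bit vessel signal:", dsa_10bit[0, 2])
print("8-bit vessel signal: ", dsa_8bit[0, 2])
```

In this toy case both 603 and 600 map to the same 8-bit value, so the subtracted image is flat and the faint signal cannot be recovered by any later processing.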
For the rest of your processing (motion compensation, subtraction, masking, temporal filtering, choice of flow or perfusion metrics, etc.) you can ask clinical experts on the 3D Slicer forum.