About the configuration or Reader/Writers that support multiple file formats

In VTK, they are a few reader/writers that internally supports multiple file formats.

eg:

  • vtkGLTFReader supports .gltf and .glb
  • vtkOCCTReader supports .stp and .igs

However, the actual reading of these files is usually different depending of the format, so the reader must use the right method to read. There are different approaches to that. I see the following:

  1. Parse the filename extension to automatically choose the right behavior (vtkGLTFReader does that)
  2. Let the user specify which code path to take using a setter/getter (vtkOCCTReader does that)
  3. Let the user specify wich code path to take but add a “Automatic” mode that behave like one and make this the default
  4. Use mime types

I think 1. is working until it doesn’t for reasons, as file extensions are bad to detect the actual format of a file. The simple case were it breaks is a format using multiple file extensions were one if forgotten or more confidential. Also some file format just do not have extensions so 1. would not be usable in that case.

  1. is fine too but makes the reader hard to integrate in applicative layer, eg, to integrate the OCCT reader in ParaView we need to create two different “reader proxy”, one for each format, so not very nice.

  2. would work but requires a bit more code, see below.

  3. is not realistic, also mime types are not a thing in Windows world (I think)

So here is a proposed standard API trying to implement 3.

  enum Format : unsigned int
  {
    FOO,
    BAR
  };

  vtkGetMacro(ManualFileFormat, unsigned int)
  vtkSetClampMacro(ManualFileFormat, unsigned int, Format::FOO, Format::BAR);

  vtkSetMacro(AutomaticFileFormat, bool)
  vtkGetMacro(AutomaticFileFormat, bool)
  vtkBooleanMacro(AutomaticFileFormat)

  unsigned int ManualFileFormat;
  bool AutomaticFileFormat;

  //  Should be in the internal pimpl
  unsigned int FileFormat;
  if (this->AutomaticFileFormat)
  {
      std::string extension = vtksys::SystemTools::GetFilenameLastExtension(this->FileName);
      if (extension == ".foo")
      {    
          this->FileFormat = FOO;
      }
      else // "bar"
      {
         this->FileFormat = FOO;
      }
   }
   else
   {
      this->FileFormat = this->ManualFileFormat;
   }

With such an implementation, user have full control when needed but also simplify the developpement of the applicative layer.

I know. nothing special in that code but since it may applies on multiple reader, just trying to make sure I’m not missing anything.

@spyridon97 @cory.quammen

1 Like

@alexy.pellegrini

Automatic and Manual are too generic for me.

I would rename AutomaticFileFormat with InferFromExtensionand ManualFileFormat with WantedFileFormat

We could get rid of WantedFileFormat all together and modify FileFormat if InferFromExtension is true but there is a less strong argument for that.

1 Like

I like the approach - automatic with the possibility to override.

This reminds me of a bug I filed against the imageio Python package, which has a generic imwrite function for writing images, when I couldn’t write JPG with non-standard extension (in my case “.jpg.part”) using that library. They fixed it by allowing the user to override the automatic backend selection (Can not imwrite a JPG with e.g. ".part" suffix · Issue #781 · imageio/imageio · GitHub).

I think the field names are confusing. this->AutomaticFileFormat is a boolean whereas this->ManualFileFormat is a file type (string).

If this->AutomaticFileFormat were changed to this->GuessFileFormat (normally false) and this->ManualFileFormat were changed to this->DefaultFileFormat, the following logic would be simpler; where ExtractFileFormat() could be a virtual method.

  this->FileFormat = this->DefaultFileFormat;
  if (this->GuessFileFormat)
  {
    std::string extension = vtksys::SystemTools::GetFilenameLastExtension(this->FileName);
    this->FileFormat = ExtractFileFormat(extension);
  }

To simplify things, automatic file format (guess from extension) could be just one the file formats:

enum Format : unsigned int
  {
    AUTO,
    FOO,
    BAR
  };

vtkGetMacro(FileFormat, unsigned int)
vtkSetClampMacro(FileFormat, unsigned int, Format::AUTO, Format::BAR);

unsigned int FileFormat{AUTO};

Thanks for the valuable inputs. I mostly agree with most of it with the exception of @lassoan suggestion.

While it does simplify the API and achieve close to the same results, it makes it harder to differentiate between the default automatic mode and format mode on the application side.

eg, in ParaView, you may want to have a GUI that looks like this:

 [X]  Use automatic format detection

 Manual Format [ FOO/BAR ]  (visible only when auto is checked off)

Seperating the two concept makes it much easier to implement such an interface.