Deprecating `printf`-style formatting strings in VTK?

Recently, the merge request https://gitlab.kitware.com/vtk/vtk/-/merge_requests/11940 that modernizes VTK’s use of string-handling functions was merged into VTK. String handling is now faster and more secure. These changes will become part of VTK 9.6.

One aspect of the changes in this modernization is the ability to use std::format-style string formatting patterns for specifying the format of strings generated from numeric values. For example,

vtkAxis::SetLabelFormat(const std::string& format)

can now accept both printf-style formatting strings (e.g., "%.3f, %.3f") as before and also std::format-style formatting strings (e.g., "{:.3f}, {:.3f}").

A question for the community is: should we continue to support printf-style formatting strings in VTK indefinitely, or deprecate support for them? Currently, VTK’s master development branch will warn that printf-style strings passed into functions that take them are deprecated and will be removed in VTK 9.7. This warning was added based on the assumption that we would stop supporting printf-style format strings, which would improve VTK code maintainability. However, we are revisiting that assumption now and are seeking input on how disruptive removing support for printf-style formatting strings would be.

Thank you for your insight on this question.

2 Likes

Is using printf style formatting strings a security / stability issue? Couldn’t you use something like this library? If not I don’t think you want to break backwards compatibility for something that’s probably widely used in examples and other code all over the place. (Bill’s no longer here to rail against breaking changes so I’ll do it for him.)

1 Like

We switched to std::format (using the fmt library) because 1) it’s type-safe, 2) it’s substantially faster. tinyformat would give us the type-safety, but not the speed, plus it’s not part of the standard. std::format is the new c++ way (as of c++20), and since it’s a very well-designed and fast format, i think we should adopt it in the long run.

As of now, if printf-style format is detected at the setter level, it is internally converted to std:;format so that we don’t need need to have spaghetti code everywhere,. and the user is warned.

The question is, should we preserve the detection/potential conversion in every setter even after VTK 9.7 or not?

If we are to preserve it, should we encourage the user (as it happens now) to convert his format string to the std::format, or is that too much nagging?

I think if someone’s concerned about performance then letting them read the documentation to optimize is a fine approach. I just don’t see the point of breaking anyone’s code for the sake of a possible speed difference (which I suspect would typically be slight - I mean, how often do you change the format string compared to the number of times it gets used?).

Don’t forget, VTK gets used in a lot of places, often by people who were not the original authors of the code they use. If they have a python script given to them by some co-worker who’s long-since left and ubuntu updates the VTK package breaking their script what are they supposed to do?

1 Like

My two cents worth is that we should encourage the user to convert their format strings. Mainly because I think VTK code maintainability is important. You don’t want to be maintaining the old way of doing things and the new way of doing things.

Python also flags deprecations now so those users will not have much trouble using something closer to what they already use in their non-VTK Python code.

I think this argument does not stand.

We deprecate many things in VTK all the time and this is a good thing.
Sometimes people complain, most of the time they do not realise.

Software is a changing things that require maintenance.
A script is software and require maintenance as much as any other software.
The only to avoid maintenance is to not change anything, which I do not recommend.

Instead, one should read warning and improve the software they maintain.

That being said, about the specific question of support print format, I think that keeping that specific feature around can definitely be considered as its a widely used.

It is also hard to justify perpetual warnings being given to users that may have no power to change the format string itself. If the conversion code is well-tested enough, I think we can consider it time well spent and just keep it. It would be interesting to have telemetry about the usage, but that is fraught with so many other issues that it’s just a pipe dream.

Definitely, no warnings if we keep it around.

So we should not encourage users to switch to the new format?

Thanks to all for working on this. Modernizing the code is good in general and in particular switching to the safer and more Python-like std::format is a good investment and probably unavoidable in the long term. A few notes:

Deprecation is a good thing? “We deprecate many things in VTK all the time and this is a good thing.” - For a VTK developer, it can feel like a relief (and even a cost saving) to simply deprecate an API and then remove it. However, such changes impose a significant burden on the VTK user community and so they should not be celebrated as inherently good. Even a trivial backward-incompatible API change may cost about $100k for the VTK user community (assuming 2000 VTK-based projects and 30 minutes for finding the issue, figuring out what to change, implement and test the change, prepare a pull request, get it reviewed and merged). Commercial projects reduce this inherent conflict of interest between toolkit users and developers by charging extra for LTS releases (then the cost of backward compatibility is partially put on users who require it), but I don’t think this would be directly applicable to VTK, so we would need to find some more creative solution.

Deprecation schedule: For smaller changes like this, keeping around the deprecated API for 2 years helps, because that allows VTK-based software projects to choose a convenient time to update. It is important to note that the scheduling is not based on VTK versions but on elapsed time: we should not say “X feature will be deprecated in VTK 9.6 and will be removed in VTK 9.7”, but that “X feature will be deprecated in August 2025 and may be removed after August 2027”.

Warnings: If there are no warnings, nobody will notice the change until the deprecated code gets removed and then developers need to rush with updating their code. To ensure that developers are aware of the coming change, it is useful to add warnings (preferably compile-time) that can be suppressed by a CMake option.

Speed of std::format: “String handling is now faster and more secure.” For sure, std::format is more secure. However, it can be up to 5-10x slower than printf variants (especially for floating-point number formatting). If the code was updated with the assumption that std::format was always faster then it may worth looking at the changes again. Performance should be tested for any code parts where formatting may be the bottleneck (e.g., in IO classes that write millions of numbers to text files).

Note that one can #define VTK_DEPRECATION_LEVEL VTK_VERSION_CHECK(9, 3, 0) to not warn about things deprecated after 9.3.0 (unless those things are impending deletion and then we force the deprecation attribute on anyways).

Ideally VTK would stick to the 6 month release cadence so that we can use it as a yardstick. If we want to extend the minimum time before deletion, that is a “simple” policy change; the mechanism doesn’t care (basically we change the release process for which deprecation macro to remove usages of and how we set the VTK_MINIMUM_DEPRECATION_LEVEL define as well).

Speed of fmt::format (the implementation currently being used) is substantially better than the printf family as seen here:

  1. GitHub - fmtlib/fmt: A modern formatting library
  2. format-benchmark/README.rst at master · fmtlib/format-benchmark · GitHub
  3. https://github.com/fmtlib/dtoa-benchmark

Deprecation warnings cannot happen at compile time because it’s a string value that changes, not a function/class. So, they can only be seen at run-time upon usage.

@cory.quammen do you think it’s time for a poll, with 3 options?

  1. Remove the conversion code completely after the deprecation period is over

  2. Keep the conversion code, but continue to encourage users via warnings to switch to the new format

  3. Keep the conversion code, and remove the warnings.

I would not blindly rely on any external benchmarks. Even just manual testing on a few large files (that take more than a few seconds to read/write) could be sufficient, especially if the results clearly show that the performance is improved. Some numbers on real-life performance improvement would also be a nice addition to the release notes.

In the long term, any of the options should acceptable for VTK users, so probably you can go with what is simplest for VTK developers.

The key is to give enough time for VTK users to update their code. Many VTK users upgrade their VTK maybe about once a year (maybe Python users more frequently), and they may not upgrade immediately but after waiting for at least a few months (for potential patch releases to land). If you kept warnings around only for one VTK release cycle = 6 months then many users would not even notice that there was a VTK version that logged some warnings. To ensure users come across the warnings, the deprecation period would need to be about twice as long as the average time period users update their VTK version, so about two years. The “updating VTK once a year” is just my estimation - if you plan to do a VTK user poll than that may be a good question to ask.

The benchmarks you see on the readme file have code that generated them which you can run yourself. I tested several VTK code conversions and fmt was 2x-3x times faster than sprintf/snprintf.

As far as the 2 years, i agree.

How about the following:

  • Remove warnings for now and support both printf and std::format strings
  • Deprecate printf-style formatting strings for the last two releases in VTK 9
  • Removal of printf-style formatting strings in VTK 10

If prediction of when a VTK 10 might come out is too uncertain, then deprecate/warn for two versions (one more release than the usual deprecation period) makes sense to me.

Strike my suggestion above. I think we should just support printf format strings in perpetuity. Aside from maintenance of the code that converts from printf to std::format strings, there aren’t many good reasons to remove support for them.

2 Likes

I posted a merge request removing the deprecation warnings here: https://gitlab.kitware.com/vtk/vtk/-/merge_requests/12442