It’s not the macros themselves; it’s the usage. And even that doesn’t hide the fact that there’s a missing semicolon; clang-format still indents afterwards (it just doesn’t edit the bracked code code). Luckily, VTK is very good about putting Macro
in macro names, so it was easy to find all the usages (though some may have escaped my search, their telltale diff pattern also doesn’t show up in a scan of the patch).
Status
The rest may be TL;DR for some, so if you just want the diff, I’ve updated the clang-format
branch on my fork. I have the buildbots testing it overnight (kwrobot needs to be told to check formatting before it can be merged and it will likely need a rebase for any other changes anyways). To those interested in some things that were done to make formatting work for VTK and notes for improvements I noticed, feel free to continue.
Manual hinting
I’ve perused the diff and gotten it down to a ~1.7M line patch that removes a net 50k lines (the initial, naïve run added a net 400k lines with a 2.5M line patch) by fixing some things. Things I found:
- imported or generated code (fixed by excluding them from formatting)
- data arrays; I added hints to clang-format to format them the way they probably should be formatted and/or moved them to separate, ignored files.
- hiding manually formatted code from the formatter (usually horizontally aligned line ups for mathematical formulas; usually after applying sensible diffs
clang-format
wanted to make)
“Bad” patterns seen while looking over the diffs
This isn’t too common, but something to watch out for.
- Lots of tests which do SOA-style storage for test data rather than AOS.
- Some data storage does
int foo[][8] = {}
and then has comments for what each element is for. Instead, an anonymousstruct
should be used to give each value set a name. Filters/* and Common/DataModel are the main culprits here. - Lots of point storage all over the place doing
arr[npts * 3]
rather thanarr[npts][3]
. Cell types and such do this a lot. - Lots of arrays that aren’t marked
static
and/orconst
. Most should probably becomeconstexpr
anyways. Happens ~everywhere in Filters/* and Common/DataModel. - Lots of the mathematical formula code doesn’t factor out common subexpressions like
(1 + z) * (1 - z)
and instead recalculate it a dozen times (just search for “derivative” case insensitively and you’ll find most of them). These sometimes, but not always, needed manual formatting hints if you want to look at that commit for a place to start. - There are many places where assignments were formatted in a matrix-style block.
clang-format
has split these into an assignment-per-line; proper matrix data structures should probably just be preferred rather than treatingdouble[9]
as a 3x3 matrix.
I saw lots of code that does:
expr1; expr2; // comment
which became:
expr1;
expr2; // comment
when it should probably be:
// comment
expr1;
expr2;
These are too hard to find mechanically, so I let clang-format
do its thing.
Getting good formatting for data arrays
Things I learned about array formatting doing this:
int arr[] = { 0, 1 };
is well-formatted, but this is the proper format with a trailing comma:
int arr[] {
0,
1,
};
So if you want your arrays to be one element per line, add a trailing comma, otherwise don’t. This is how I hinted to clang-format
how to format various data arrays (the hinting is a separate commit for anyone curious).