Address EditorConfig violations#158550
Conversation
As Cargo.lock and yarn.lock are normally created automatically, we do not want using a different format when editing it manually.
The auxiliary vector files are binary files.
They use all two spaces for indentation. The intrinsic.natvis differed in two ways from the others: - It had a UTF-8 BOM. - It had two multiline comments in a different format. (No new line after <!--)
There are two file extensions for YAML and the '.yaml' currently also use two spaces for indentation.
All JSON and JSON5 files in the repostiory use two spaces for indentation.
Mostly fixing the indentation size (use 4 spaces everywhere and not in some file 2 and some 4 for the same situations). Furthermore, insert final newline for one of them.
In several file types the mixing of different formats (mostly indent_size) seems to be intentional.
All other Cargo.toml files already use 4 spaces for indentation.
There is often alignment which "breaks" the indentation rules. Furthermore, some files use 4 spaces and some 2 spaces, but these indentation errors cannot be separated from the alignments.
All three nix file use 2 spaces for indentation.
The most shell scripts use 4 spaces for indentation. The set of scripts using (only) 2 spaces is smaller.
When git creates or modifies this file, it uses tab characters.
./x test failed early due to too long line
Only one of them had already a final newline. All tests still pass after these insertions (executed with ./x test).
Adjust shell scripts, without .sh extension, to 4 spaces, ignore remaining mixed files and adjust EditorConfig for some files.
|
Some changes occurred in HTML/CSS/JS. |
|
Thanks for the pull request, and welcome! The Rust Project is excited to review your changes, and you should hear from @jdno (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
Why was this reviewer chosen?The reviewer was selected based on:
|
| # For example, the indentation in lists depend on the length of the list marker | ||
| # (such as '- ' or '100. '), at least in the GitHub Markdown flavor: | ||
| # https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#nested-lists | ||
| # Therefore, pressing tab should insert a single space to encourage situative accurate indentations. |
There was a problem hiding this comment.
It is maybe a matter of taste if inserting only one space with a tab is a nice solution for this or not. Another one would be to keep the default indent_size = 4 for Markdown files and ignore them when checking for EditorConfig violations.
| # these specific files have no trailing newline to test on these | ||
| [tests/ui/{lint/unused_parens_multibyte_recovery.rs,parser/{macro/macro-missing-right-paren.rs,missing_right_paren.rs}}] | ||
| insert_final_newline = false | ||
| [tests/{pretty/issue-74745.rs,ui/{parser/issues/issue-{62524,68730,58094-missing-right-square-bracket}.rs,type/issue-91268.rs}}] |
There was a problem hiding this comment.
These files all have the name pattern issue-*.rs. Using the pattern would reduce the maintenance effort, if new such files without final newlines gets added to the project. However, not all files with that pattern are without a final newline.
| tests/run-make-cargo/thumb-none-qemu/example/memory.x | ||
| src/bootstrap/mk/Makefile.in | ||
| src/etc/rust_analyzer_eglot.el | ||
| src/librustdoc/html/static/fonts/README.txt |
There was a problem hiding this comment.
The content of this README looks almost like Markdown. Only some formatting bugs may need to be fixed. Should that be changed to a Markdown file? If the content and file ending would be changed, it would not be necessary to ignore it here.
| library/std/src/sys/pal/sgx/abi/entry.S | ||
| tests/pretty/block-comment-wchar.pp | ||
| tests/run-make-cargo/thumb-none-qemu/example/memory.x | ||
| src/bootstrap/mk/Makefile.in |
There was a problem hiding this comment.
This Makefile uses two styles of indentation. First, for targets of a Makefile you need to use tab characters for indentation. Otherwise, it is syntactically wrong. Second, there is some further code at the beginning of this file and there 2 spaces are in use for indentation. I do not know if using only tab character for indentation would be fine for this file or not.
There was a problem hiding this comment.
Most of the *.check files had no final newline. Although, tests (./x test) still pass on my machine after adding these final newlines, I want to ask if you may prefer to have no final newlines in these files. If so I could remove them from all the *.check files and set insert_final_newline = false for them.
This comment has been minimized.
This comment has been minimized.
Remove all added final newlines as test like tests/ui/argfile/commandline-argfile-badutf8.rs fail in the CI job aarch64-gnu-llvm-21-1 Therefore, set insert_final_newline = false for all the args files for the tests as they seem to have no final newline intentional.
This comment has been minimized.
This comment has been minimized.
|
In tidy we ignore all test files for style lints. At the same time it is still useful to not include it in the editorconfig ignore to increase the chance of new or changed tests having a proper format. Maybe the test changes should be reverted? |
| # these specific files have an UTF-8 BOM to test with that charset | ||
| [tests/ui/codemap_tests/utf8-bom.rs] | ||
| charset = utf-8-bom | ||
| # these specific files have CRLF line endings to test on them | ||
| [tests/{rustdoc-ui/intra-doc/warning-crlf.rs,ui/{frontmatter/frontmatter-crlf.rs,lexer/lexer-crlf-*.rs}}] | ||
| end_of_line = crlf | ||
| # these specific files have an UTF-8 BOM and CRLF line endings to test on these | ||
| [tests/ui/{include-macros/data.bin,json/json-bom-plus-crlf*.rs}] | ||
| charset = utf-8-bom | ||
| end_of_line = crlf |
There was a problem hiding this comment.
This is now a third location (next to .gitattributes and // ignore-tidy-cr in the test itself) where CRLF containing tests are placed. This is bound to go out of sync: #157184
Removing the final newlines in this two files did not undo all changes in these files, which lead to similar errors as before.
| @@ -1,4 +1,4 @@ | |||
| <?xml version="1.0" encoding="utf-8"?> | |||
| <?xml version="1.0" encoding="utf-8"?> | |||
There was a problem hiding this comment.
What changed here? Github is showing mojibake in the original file. Was that a UTF-8 BOM or something?
There was a problem hiding this comment.
This file had the charset utf-8-bom instead of utf-8. Therefore, I removed the BOM, which is not visible in the diff here. (I do not see any mojibake here. GitHub shows me no visible difference in this line.)
This could be an approach, but the options |
|
Editor autoformat can also result in unintended changes as well as editor defaults that already insert a final newline and trim trailing whitespace like I have myself. You have to be careful what changes you commit either way. Unsetting |
|
You are right, better would be to set |
This Pull Request is part of an academic research project. More preciously it is for my master's thesis, more background information here.
Definition: EditorConfig violations
The goal of this Pull Request is to address all EditorConfig violations in the project. With EditorConfig you have set for example
insert_final_newline = truefor most files, but there were files without a final newline. I call this an EditorConfig violation as a file content violates the respective EditorConfig property. They can have different kind of impacts. For example trailing spaces, which get remove with the next save, may only make the next diff / Pull Request to this file less readable. Some other may lead to bugs, such as adding final newlines to test files, which are not allowed to have them.How to address them?
To check for such violations and fix them I developed the CLI tool ecformat. However, there are different ways to address the violations:
insert_final_newline = falsefor them, which avoids automatic insert of a final newline on save.To do so
ecformat fixcan be helpful.ecformat checkfinds. I added a.ecformat_ignorefile for this. It has the common ignore syntax such as a.gitignore.Do use such ignore files, provide the name of them with an CLI switch,
ecformat check --ignore-file .ecformat_ignore. (The.gitignorefiles are automatically considered inside a git repository.) There are different reasons to ignore files:indent_size = 4set, but to align some code with the previous line an indentation / alignment uses for example 5 spaces. Detecting such cases as alignment and not as indentation needs language specific parsing, which is out of scope for a generic EditorConfig utility. Such files need to be ignored fully byecformatas it currently (version 0.2.0) has no support to ignore only a single property for some files.Changes in this Pull Request
For this Pull Request, I addressed all violations in the project in the way, which looks the most correct one to me. (I focused on the Properties of the current EditorConfig specification 0.17.2.)
If I fixed file contents, I manually reviewed all of them, making adjustments in the
ecformat fixchanges were necessary. Furthermore,ecformatdetects the charset wrong for some files. Therefore, I would suggest disabling this property for whole project, with--disable-charsetor-c. (I plan to improve the charset detection in a later version.)If I picked the wrong way to address the violations for some files, please let me know with a respective Review comment.
Then I will address it in the correct way. I add Review comments myself for special parts, which are especially worthy of discussion. To support the Review of this big Pull Request, I tried to group the changes in meaningful commits with respective commit messages. Therefore, may take a look at the commit history.
Avoid EditorConfig violations in the future
Such EditorConfig violations can creep in again. Possible reasons can be that the editor of some contributors has not EditorConfig support, or it would be required to install a respective plugin for the support.
If you want to avoid them in the future, I would suggest integrating a respective utility. For example as part of the CI, some pre-commit scripts or similar places. One possibility is to use
ecformatwithecformat check --disable-charset --ignore-file .ecformat_ignore. Another one is editorconfig-checker, which is a more sophisticated tool without the possibility to fix violations. It provides more configuration options, such as ignoring violations in single lines with marker comments.Let me know if you are interested using one of them. I could add one in this Pull Request or a follow-up Pull Request.
If you not consider to use
ecformatin your project, the.ecformat_ignorefile is maybe not worth to merge into main. I can remove it if desired.