This is an automated archive made by the Lemmit Bot.

The original was posted on /r/programminglanguages by /u/RamblingScholar on 2024-10-18 15:11:19+00:00.


Everyone has a preferred source code formatting method. While you can reformat a file when you pull it from source control, committing the reformatted version will DESTROY change tracking, since it suddenly looks like every line has changed.

Also, being able to quickly read and share source code thanks to a common format is invaluable. I realize this, and I understand why having some standard, no matter which, is better than having the “best standard”. But it would be nice to always see source code formatted in your preferred form.

I know ASTs vary by parser, so they can’t be used for this. However, given that this is more about form, what about storing code as lexer tokens? That should negate formatting differences, which would allow diffs to work and things like git blame to keep functioning. At the same time, each developer’s editor could be set up to reassemble the lexemes into source code in that developer’s preferred form. It could even help somewhat with natural languages: while variable names wouldn’t be changed, all the literals could be translated into another language to make the code easier to read for non-English speakers.
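
To make the idea concrete, here is a rough sketch of what "comparing code as lexer tokens" could look like. It is only an illustration under assumptions: it uses Python and the standard library `tokenize` module as a stand-in lexer (the post doesn't name a language), and `token_stream` is a made-up helper name, not an existing tool. It drops positions and layout-only tokens, so two differently formatted copies of the same code produce the same stream.

```python
import io
import tokenize

# Token types that only reflect layout (blank lines, line breaks inside
# brackets, end of file). Assumption: these are safe to drop for comparison.
LAYOUT_ONLY = {tokenize.NL, tokenize.ENDMARKER}

# Token types that matter structurally in Python, but whose exact text
# (for example the indent width) is purely a formatting choice.
STRUCTURAL = {tokenize.INDENT, tokenize.DEDENT, tokenize.NEWLINE}


def token_stream(source: str) -> list[tuple[str, str]]:
    """Return (token type name, text) pairs with positions and layout removed."""
    stream = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in LAYOUT_ONLY:
            continue
        # Keep structural tokens, but blank out their formatting-dependent text.
        text = "" if tok.type in STRUCTURAL else tok.string
        stream.append((tokenize.tok_name[tok.type], text))
    return stream


compact = "def add(a,b):\n    return a+b\n"
spaced = "def add( a, b ):\n        return a  +  b\n"

# Same tokens, different whitespace: the streams compare equal.
print(token_stream(compact) == token_stream(spaced))  # prints True
```

Even this small sketch shows some of the trade-offs: comments survive as tokens, but blank lines are thrown away, so anything a developer expresses through layout would have to be stored some other way or be lost.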

Has this been done, and how many glaring problems with this am I missing? I looked into storing more processed versions, but the ambiguity I found there would kill that option.