Designing type inference for high quality type errors

(blog.polybdenum.com)

66 points | by PaulHoule 4 days ago

6 comments

johnbender 15 hours ago
Minor nit:
> The job of a compiler error message is to prove to the user that their code is invalid
The job of the compiler error message is to convey why the compiler couldn’t demonstrate the code is correct. The code may be valid but the compiler just can’t prove it (analyses are sound but not complete unless the language is strongly normalizing)
[-]
- PaulHoule 15 hours ago
  Practically the difficulty of generating error messages is one reason why "we can't have nice things". I ran into this problem while trying to write an adventure game (like Zork) in Drools. Drools sucks in a lot of stuff from a Java compiler and mashes it into a production rules framework and the result is I was getting error messages that made no sense at all and thus gave up on the project.
  One approach to compiler development is one of successive transformations and it is possible to cut compilers into very thin slices like the original FORTRAN for the IBM 1401. Similarly you can go beyond the methods taught in Graham's On Lisp to do rather complex metaprogramming in Lisp or some other language, trouble is that an error manifests at stage N+5 but the information required to explain what went wrong was elided at stage N so it is quite difficult to figure out what's going wrong.
- tshaddox 14 hours ago
  Sure, but the goal when designing a compiler and its error messages is to prove to the user that their code in invalid. Most practical type systems are going to be either unsound or incomplete (or both), but the goal should be to make encountering those limitations rare in practice.
  [-]
  - AnimalMuppet 12 hours ago
    It seems to me that, if the limitations are encountered too often in practice, then people stop using that language to write that kind of code, because it's too frustrating to use that way.
  - LoganDark 11 hours ago
    > Sure, but the goal when designing a compiler and its error messages is to prove to the user that their code in invalid.
    The goal when designing a compiler and its error messages should be to give the user the information they need to correct their code.
    [-]
    - tshaddox 11 hours ago
      A distinction without a difference. Perhaps "prove" is sounding too mathematical. I could rephrase it as "the goal...is to demonstrate to the user what is invalid about their code."
      [-]
      - LoganDark 11 hours ago
        > I could rephrase it as "the goal...is to demonstrate to the user what is invalid about their code."
        This is closer, yes.
        [-]
        tshaddox 9 hours ago
        To prove is to demonstrate, and the way to do it is to display information about what is invalid about the code. So all these ways of phrasing it are synonymous.
- layer8 12 hours ago
  One way of formally defining a programming languages is as the set of strings accepted as programs by the compiler, or by the formally defined algorithm that is implemented by the compiler. By such a definition, a given string is not a valid program of the respective programming language if the algorithm rejects it. And it is useful for the error message to show how the input program fails the requirements of the algorithm. The input program may be sound in type-system terms, but still not valid for the respective programming language.
renox 35 minutes ago
I'm not sure I value type inference more than function overloading.. Local type inference yes, global, mmmh..
tayistay 15 hours ago
I've been working on a type checker for a language with ad-hoc overloading and what I did was have the checker proceed iteratively, making passes over the set of constraints and applying deduction rules. So it never guesses, branches, or has to backtrack. If it can't make progress because there's too much overloading, it gives up and asks the user to add some type annotations. I suspect this will actually work quite well in practice even if it can't type check some valid programs.
init1 3 hours ago
All these years and Ada is still the undisputed and unmatched master of type safety.
ruuda 14 hours ago
For reporting the expected type on mismatch, I do something similar in RCL to be able to track the source of the expectation: https://ruudvanasseldonk.com/2024/implementing-a-typechecker...
estebank 14 hours ago
Hot take: the primary role of a compiler is to turn malformed code into human readable diagnostics. They are error reporting tools with a codegen is side-gig.
[-]
- JonChesterfield 11 hours ago
  This can and should be a separate tool, probably called a linter. The compiler dealing with codegen can then just say "nope, that's not a program in this language, try the linter", optionally calling into the linter for you.
  This does of course mean that divergent ideas of "correct" between linter and compiler are confusing and really do need to be avoided for happy developer times.