Classic mistakes in language design that have to be fixed later.
- "We don't need any attributes", like "const" or "mut". This eventually gets retrofitted, as it was to C, but by then there is too much code without attributes in use. Defaulting to the less restrictive option gives trouble for decades.
- "We don't need a Boolean type". Just use integers. This tends to give trouble if the language has either implicit conversion or type inference. Also, people write "|" instead of "||", and it almost works. C and Python both retrofitted "bool". When the retrofit comes, you find that programs have "True", "true", and "TRUE", all user-defined.
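To make the Python half of that concrete, here is a small sketch of how the retrofit still shows through. The helper `check` and the variable names are made up for illustration; the behaviour itself is standard Python, where `bool` is a subclass of `int` and `|` is the non-short-circuiting bitwise operator.

```python
# bool was retrofitted as a subclass of int, so booleans still act as 0 and 1.
assert issubclass(bool, int)
assert True == 1 and False == 0
assert True + True == 2

# "|" is bitwise and evaluates both sides; "or" short-circuits.
calls = []

def check(name, result):
    """Hypothetical helper: record that it was called, then return result."""
    calls.append(name)
    return result

calls.clear()
check("a", True) or check("b", True)
short_circuit_calls = list(calls)  # only the left side ran

calls.clear()
check("a", True) | check("b", True)
bitwise_calls = list(calls)        # both sides ran

# On plain integers the two operators also differ in value.
bitwise_value = 2 | 1   # bitwise OR of 0b10 and 0b01
logical_value = 2 or 1  # first truthy operand
```

With real booleans the two spellings give the same value, which is the "almost works" part; the difference only shows up in side effects or once integers sneak in.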
Then there's the whole area around Null, Nil, nil, and Option. Does NULL == NULL? It doesn't in SQL.
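For anyone who hasn't run into the SQL behaviour, a quick way to see it is SQLite through Python's standard `sqlite3` module. This is just a sketch; the `scalar` helper is my own name, and other databases may differ in details, but three-valued logic for `NULL = NULL` is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a query that yields a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Comparing NULL with "=" yields NULL (unknown), which Python sees as None.
null_equals_null = scalar("SELECT NULL = NULL")

# The dedicated IS NULL test is the one that returns true (1 in SQLite).
null_is_null = scalar("SELECT NULL IS NULL")

# In a WHERE clause, unknown is treated as "not true", so no row matches.
rows_matched = scalar("SELECT count(*) FROM (SELECT 1) WHERE NULL = NULL")
```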
That's what's nice about coarse-grained feature options like Rust's editions or Haskell's "languages": you can opt in to better default behavior and retain compatibility with libraries coded to older standards.
The "null vs null" problem is commonly described as a problem with the concept of "null" or optional values; I think of it as a problem with how the language represents "references", whether via pointers or some opaque higher-level concept. Hoare's billion-dollar mistake was not so much allowing null as failing to provide references which are guaranteed to be non-null, i.e. ones that refer to a value which exists.
Indeed it does, by showing how many different and confusing types of parsing rules are used in languages that don't have statement terminators. Needing a parser clever enough to interpret essentially a 2-d code format seems like unnecessary complexity to me, because at its core a programming language is supposed to be a formal, unambiguous notation. Not that I'm against readability; I think having an unambiguous terminating mark makes it easier for humans to read as well. If you want to make a compiler smart enough to help by reading the indentation, that's fine, but don't require it as part of the notation.
Non-statement-based (functional) languages can be excepted, but I still think those are harder to read than statement-based languages.
> I would love to see a language try to implement a rule where only an indented line is considered part of the previous expression.
After python, it seems like every language decided that making parsing depend on indents was a bad idea. A shame, because humans pretty much only go by indents. An example I've frequently run into is where I forget a closing curly brace. The error is reported at the end of the file, and gives me no advice on where to go looking for the typo. The location should be obvious, as it's at exactly the point where the indentation stops matching the braces. But the parser doesn't look at indents at all, so it can't tell me that.
I was much more opposed to this early on than I am now. With modern IDEs and extensions handling tabs vs spaces, tab width, and formatting, python ends up being very easy to read and write. I use it daily, and while I hate it for other reasons, I can't remember the last time I had any issues with indentation.
The issue is that you find you very often want to break those rules. Python basically has `elif` because `else if` would make each branch nest one level deeper, which isn't what one wants. But Python also uses exceptions for flow control, so you find yourself having to use `except ... try` as an analogue to `else if`, and no `excetry` exists to do the same and stop the indentation.
There are many other examples. Explicit delimiters exist to give people freedom. Also, while humans only go by indentation, delimiters are very handy for text editing and manipulation: without any special per-language support you can move the cursor to, say, the nearest closing brace and so forth.
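A small sketch of the `elif` versus `except ... try` asymmetry, for anyone who hasn't hit it. The functions `classify` and `parse` are made-up examples, not from any library.

```python
def classify(n):
    # `elif` keeps every branch at the same indentation level.
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


def parse(text):
    # Fallbacks driven by exceptions have no flat spelling:
    # each further attempt nests one level deeper.
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return text
```

You can flatten the second one with a loop over candidate converters, but that is a restructuring of the code, not something the syntax gives you.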
> An example I've frequently run into is where I forget a closing curly brace. The error is reported at the end of the file, and gives me no advice on where to go looking for the typo. The location should be obvious, as it's at exactly the point where the indentation stops matching the braces. But the parser doesn't look at indents at all, so it can't tell me that.
That's somewhat a quality of service issue though. Compilers should look at where the braces go out of kilter vs indentation and suggest the possible unmatched opening brace.
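As a rough sketch of what such a diagnostic could look like (not how any particular compiler does it): track each opening brace with the indentation of its line, and when a closing brace turns up at a shallower indentation, flag the deeper openers as probably unclosed. The function name and the sample input are hypothetical, and the heuristic assumes `{` ends a line and `}` starts one; strings and comments are ignored.

```python
def suspect_unclosed_braces(source):
    """Return 1-based line numbers of '{' that indentation suggests were never closed."""
    open_stack = []  # (line_number, indent) for each '{' not yet matched
    suspects = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        if stripped.startswith("}"):
            # Openers indented deeper than this '}' probably lost their own '}'.
            while open_stack and open_stack[-1][1] > indent:
                suspects.append(open_stack.pop()[0])
            if open_stack:
                open_stack.pop()
        if stripped.endswith("{"):
            open_stack.append((number, indent))
    return suspects


example = """\
fn main() {
    if ready {
        start();
    for item in items {
        handle(item);
    }
}
"""

suspects = suspect_unclosed_braces(example)
```

A brace-only parser would complain at the end of the file here; the indentation check points at the `if ready {` line instead.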
As a casual observer who has written perhaps a dozen lines of Scala in his life, I feel like Scala approaches any “pick one” decision with “why not both?”.
It's interesting seeing all of the different ways language designers have approached this problem. I have to say that my takeaway is that this seems like a pretty strong argument for explicit end of statements. There is enough complexity inherent in the code, adding more in order to avoid typing a semicolon doesn't seem like a worthwhile tradeoff.
I'm definitely biased by my preferences though, which are that I can always autoformat the code. This leads to a preference for explicit symbols elsewhere, for example I prefer curly brace languages to indentation based languages, for the same reason of being able to fully delegate formatting to the computer. I want to focus on the meaning of the code, not on line wrapping or indentation (but poorly formatted code does hinder understanding the meaning). Because code is still read more than it is written it just doesn't seem correct to introduce ambiguity like this.
Would love to hear from someone who does think this is worthwhile, why do you hate semicolons?
Start from the perspective of the user seeing effectively:
> error: expected the character ';' at this exact location
The user wonders, "if the parser is smart enough to tell me this, why do I need to add it at all?"
The answer to that question, "it's annoying to write the code to handle this correctly", is thoroughly lazy and boring. "My parser generator requires the grammar to be LR(1)" is even lazier. Human language doesn't fit into restrictive definitions of syntax; why should language for machines?
> Because code is still read more than it is written it just doesn't seem correct to introduce ambiguity like this.
That's why meaningful whitespace is better than semicolons. It forces you to write the ambiguous cases as readable code.
Are we really saving that much by not having semicolons? IDEs could probably autocomplete this with high success, and it removes ambiguity from weird edge cases. On the other hand, I've not once had to think about where Go is putting semicolons...
Those are functional languages that generally don't use statements, so it makes sense to leave them out of a discussion about statement separators. If you think more people should use functional languages and so avoid the semicolon problem altogether, you could argue that.
Functional hardly matters. Haskell has plenty of indentation-sensitive syntax, which is, by the way, interchangeable with `{ ... }`; one can use either at one's own pleasure, and it's needed for many things.
Also, famously `do { x ; y ; z }` is just syntactic sugar for `x >> y >> z` in Haskell where `>>` is a normal pure operator.
Because formatters are increasingly popular, I think it'd be interesting to see a language that refuses to compile if the code is improperly formatted, and ships with a more tolerant formatter whose behavior can change from version to version. This way, the language can worry less about backwards compatibility or syntax edge cases, at the cost of taking away flexibility from its users.
So, the question is, if you have a long expression, should you have to worry too much about either adding parentheses, or making sure that your line break occurs inside a pair of parentheses.
It boils down to preference, but a language feature that supports whatever preference you have might be nice.
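For the "line break inside a pair of parentheses" option, this is roughly what it looks like in Python, which joins lines implicitly inside brackets. The function name, thresholds and labels are only illustrative.

```python
def priority_for(hours):
    # Inside the parentheses the expression may wrap freely; without them
    # the first line would end at a dangling `else` and fail to parse.
    priority = (
        "URGENT" if hours < 2 else
        "HIGH" if hours < 24 else
        "MEDIUM" if hours < 72 else
        "LOW"
    )
    return priority
```

So the cost is one pair of parentheses per wrapped expression; whether that counts as "worrying too much" is the preference part.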
priority = "URGENT" if hours < 2 else
           "HIGH" if hours < 24 else
           "MEDIUM" if hours < 72 else
           "LOW"
It's not. Your eyes can deceive you by guessing the correct indentation. Indentation should never be used for grammar separation. Explicit characters such as } ] ) are clearer and unambiguous.
Clearer for the computer, but not for the human. Many errors, some severe, have been caused by a human only looking at the indentation and not realizing the braces don't match.
> how does Gleam determine that the expression continues on the second line?
The fact that it isn't obvious means the syntax is bad. Stuff this basic shouldn't be ambiguous.
> Go's lexer inserts a semicolon after the following tokens if they appear just before a newline ... [non-trivial list] ... Simple enough!
Again I beg to differ. Fundamentally it's just really difficult to make a rule that is actually simple, and lets you write code that you'd expect to work.
I think the author's indentation idea is fairly reasonable, though I think indentation sensitivity is pretty error-prone.
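For reference, the Go rule quoted above can be sketched in a few lines. Per the Go spec, a semicolon is inserted after a line's final token if it is an identifier, a literal, one of `break`, `continue`, `fallthrough`, `return`, or one of `++`, `--`, `)`, `]`, `}`. The Python below is only a toy: it assumes tokens are separated by whitespace, uses a crude literal check, and the function names are mine.

```python
import re

GO_KEYWORDS = {
    "break", "case", "chan", "const", "continue", "default", "defer", "else",
    "fallthrough", "for", "func", "go", "goto", "if", "import", "interface",
    "map", "package", "range", "return", "select", "struct", "switch", "type",
    "var",
}
TERMINATING_KEYWORDS = {"break", "continue", "fallthrough", "return"}
TERMINATING_PUNCTUATION = {"++", "--", ")", "]", "}"}


def ends_statement(last_token):
    """Decide whether a semicolon follows this line-final token."""
    if last_token in TERMINATING_KEYWORDS or last_token in TERMINATING_PUNCTUATION:
        return True
    if last_token in GO_KEYWORDS:
        return False
    if re.fullmatch(r"[A-Za-z_]\w*", last_token):
        return True  # identifier
    # Crude literal check: starts like a number, string, or rune.
    return last_token[0].isdigit() or last_token[0] in "\"'`"


def insert_semicolons(lines):
    """Append ';' to each line whose final whitespace-separated token ends a statement."""
    result = []
    for line in lines:
        tokens = line.split()
        if tokens and ends_statement(tokens[-1]):
            result.append(line + ";")
        else:
            result.append(line)
    return result
```

It reproduces the well-known surprises: `insert_semicolons(["if ok", "{"])` puts a semicolon after `ok`, which is why the opening brace has to stay on the same line, and a bare `return` followed by a value on the next line becomes two statements.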
You can mix indentation and braces to delimit blocks.
It's insane.
Functional or OO? Yes.
Elm does this (so maybe Haskell too). For example
such as applicative style formatted like this:
Because it's very little extra work.
If you want to know if it's a good syntax, AFAIK it's the only way to do a semicolon-less language that doesn't break all the time.
In PowerShell you can do that by explicitly indicating that the next line is actually a continuation of the previous one:
If it ever gets to that point, a refactor is obligatory.
Don't give the human tools to make easy mistakes. Any grammar can be abused, so blame the human for not writing clean code.
Semicolons are just noise. They're absolutely redundant.
Some brackets are necessary, but whitespace/indent languages make it clear there's a lot of redundancy there too.
The goal is to minimise errors and cognitive load. The fewer characters the better.