Ask HN: A retrofitted C dialect?

Hi, I'm Anqur, a senior software engineer with a varied background in which C development was often an important part of my work. For example:

1) Games: a Chinese/Vietnamese game with the server and client written in C/C++ and Lua for scripting [1].
2) Embedded systems: a switch/router with the network stack written entirely in C [2].
3) (Networked) file systems: the CephFS client, which is a kernel module [3].

(I've left the finer details to the links, but these are real projects I used to work on.)

Recently, Rust and C in the kernel has been a hot topic, and one message [4] caught my attention. It talks about the Rust "experiment" in kernel development:

> I'd like to understand what the goal of this Rust "experiment" is: If we want to fix existing issues with memory safety we need to do that for existing code and find ways to retrofit it.

So for many years I have kept thinking about a new C dialect that retrofits fixes for these problems onto C itself.

Sometimes big systems and software (e.g. OSes, browsers, databases) can be written entirely in another language such as C++, Rust, D, or Zig. But often, as I hinted above, it's not feasible to switch languages: making a good filesystem client, for example, requires writing kernel modules (i.e. providing a VFS implementation; I do know FUSE, but I believe it's better to use the VFS directly).

And I still love C, for its unique "bare-bones" experience:

1) Just talk to the platform: almost all platforms speak C. Nothing like Rust's PAL (platform abstraction layer) is needed.
2) Just talk to other languages: C is the lingua franca (except that Go needs no libc by default). Not to mention that if I want WebAssembly to talk to Rust, `extern "C"` is needed in the Rust code.
3) Just a libc, which is widely available, and I write my own data structures carefully. Since one is usually writing some critical component of a bigger system in C, it's okay that there are not many existing libraries to choose from.
4) I don't need over-generalized generics; my use of generics is quite limited.

So unlike a few `unsafe` blocks in otherwise safe Rust, I want something like a few "safe" parts in an ambiently "unsafe" C dialect. I'm not saying "unsafe" is good or bad; my point is that the "unsafe vs. safe" framing doesn't apply here. It's just C, and you wouldn't call anything in C "safe" or "unsafe".

I'm also an expert in implementing advanced type systems; some of my work includes:

1) A row-polymorphic JavaScript dialect [5].
2) A tiny theorem prover with Lean 4 syntax in less than 1K LOC [6].
3) A Rust dialect with reuse analysis [7].

Language features such as generics, compile-time evaluation, traits/typeclasses, and bidirectional typechecking are straightforward for me; I implemented them in the projects above.

For the retrofitted C, these features come to mind first:

1) Code generation directly to C: no LLVM IR, no machine code.
2) Modules, like C++20 modules, to eliminate the use of headers.
3) Compile-time evaluation and type-level computation, so that something like `malloc(int)` is actually valid.
4) Tactics-like metaprogramming to generate definitions, acting like type-safe macros.
5) Quantitative types [8] to track the use of resources (pointers, FDs). The typechecker tells the user where `free` must be inserted, rather than doing anything implicit like RAII.
6) Limited lifetime checking, although some people tell me lifetimes are not needed in such a language.
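
A minimal sketch of what point 5 could look like from the user's side, written in today's plain C. The function `dup_prefix` and the `checker:` comment are hypothetical illustrations only; they assume a checker that treats `buf` as a resource that must be freed exactly once or handed to the caller, and they are not the output of any existing tool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: `buf` is treated as a linear resource, so every
 * path must either free it exactly once or hand it to the caller. */
char *dup_prefix(const char *src, size_t n)
{
    assert(n < SIZE_MAX);          /* n + 1 below must not overflow */

    if (src == NULL)
        return NULL;               /* nothing acquired yet, nothing to free */

    char *buf = malloc(n + 1);     /* resource acquired (if non-NULL) */
    if (buf == NULL)
        return NULL;               /* allocation failed, nothing to free */

    if (strlen(src) < n) {
        free(buf);                 /* checker: `buf` is still owned here */
        return NULL;
    }

    memcpy(buf, src, n);
    buf[n] = '\0';
    return buf;                    /* ownership moves to the caller */
}
```

The point of the sketch is that the `free` stays explicit in the source: the checker would only report the early-return path where it is missing, and the caller remains responsible for freeing the returned pointer.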

Any further insights? Should I kick off such a project? I would really appreciate your ideas.

[1]: https://vi.wikipedia.org/wiki/V%C3%B5_L%C3%A2m_Truy%E1%BB%81n_K%E1%BB%B3

[2]: https://e.huawei.com/en/products/optical-access/ma5800

[3]: https://docs.ceph.com/en/reef/cephfs/

[4]: https://lore.kernel.org/rust-for-linux/Z7SwcnUzjZYfuJ4-@infradead.org/

[5]: https://github.com/rowscript/rowscript

[6]: https://github.com/anqurvanillapy/TinyLean

[7]: https://github.com/SchrodingerZhu/reussir-lang

[8]: https://bentnib.org/quantitative-type-theory.html

5 points | by anqurvanillapy 8 hours ago

3 comments

Rochus 1 hour ago
There are approaches with at least partly the same goals as you mentioned, e.g. Zig. Personally, I have been working on my own C replacement for some time, which meets many of your points (see https://github.com/micron-language/specification). Its syntax is derived from my Oberon+ language, not from C (even though I have used C and C++ for decades, I don't think it's a good syntax). It has compile-time execution, inlines and generic modules (no need for macros or a preprocessor). The current version is minimal, but extensions like inheritance, type-bound procedures, Go-like interfaces or the finally clause (for a simple RAII or "deferred" replacement) are already prepared.
- anqurvanillapy 48 minutes ago
  > There are approaches e.g. Zig.
  Yes! Zig has done a great job on a lot of C-related stuff, e.g. they made it possible to cross-compile C/C++ projects with the Zig toolchain years ago. But I'm still quite stupidly obsessed with source-level compatibility with C. I don't know if that's good, but things like "Zig uses `0xAA` for debugging undefined memory, not C's traditional `0xCC` byte" make me feel Zig is not "bare-bones" enough for the C world.
  > Micron and Oberon+ programming language.
  They look absolutely cool to me! The syntax looks inspired by Lua (`end` marker) and OCaml (`of` keyword), CMIIW. The features are pretty nice too. I will look more into the design of generic modules and inheritance, since I'm not sure what a good extensibility feature would look like for C users.
  BTW, I noticed your GitHub profile follows only one person, and it's Haoran Xu. Any story there lol? He's such a genius, making a better LuaJIT, a baseline Python JIT and a better Python interpreter all happen in real life.
leecommamichael 7 hours ago
We seem to have the same desire for a “cleaned up C.” Could you say more about how metaprogramming would work? I doubt you want to put lifetimes into the type system to any degree. The reason C compiles so much quicker than C++ is the lack of features. Every feature must be crucial. Modules are crucial to preserving C.
- anqurvanillapy 3 hours ago
  > We seem to have the same desire for a “cleaned up C.”
  That's so great! It's just sad that not many ideas and arguments have come up here. :'(
  > How metaprogramming would work?
  Besides "tactics" in Coq and Lean 4 (i.e. a DSL to control the typechecker, e.g. to declare a new variable), there are almost equivalent features like "elaborator reflection" in Idris 1/2 [1] (e.g. create some AST nodes and let the typechecker check whether they're okay), and most importantly, in Scala 3 [2], you can use the `summonXXX` APIs to generate new definitions for the compiler (e.g. automatically create an instance of a JSON encoding trait if instances for all fields of a record type are given).
  So the idea is: expose some typechecker APIs to the user, with which one can create well-typed or ready-to-type AST nodes at compile time.
  [1]: https://docs.idris-lang.org/en/latest/elaboratorReflection/e...
  [2]: https://docs.scala-lang.org/scala3/reference/contextual/deri...
  > Lifetime and compilation speed.
  Yes, exactly. I was considering features from Featherweight Rust [3]; some subset of it might be applicable. But yes, one should be very careful about bringing in new features, for the sake of compilation speed.
  It's also worth mentioning that a C compiler itself already does some partial "compile-time eval", like constant folding, during optimization. I know some techniques [4] to achieve this during typechecking rather than in a separate pass, and things like incremental compilation and the related caching could bring benefits here.
  [3]: https://dl.acm.org/doi/10.1145/3443420
  [4]: https://en.wikipedia.org/wiki/Normalisation_by_evaluation
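  For readers unfamiliar with constant folding, here is a minimal sketch in C over a toy expression tree. It is a plain bottom-up folding pass, not normalisation by evaluation, and every name in it (`Expr`, `lit`, `var`, `add`, `mul`, `fold`, `expr_free`) is hypothetical and made up for illustration.

  ```c
  #include <assert.h>
  #include <stdlib.h>

  typedef enum { EXPR_LIT, EXPR_VAR, EXPR_ADD, EXPR_MUL } ExprKind;

  typedef struct Expr {
      ExprKind kind;
      long value;              /* used by EXPR_LIT */
      const char *name;        /* used by EXPR_VAR */
      struct Expr *lhs, *rhs;  /* used by EXPR_ADD and EXPR_MUL */
  } Expr;

  static Expr *mk(ExprKind kind, long value, const char *name,
                  Expr *lhs, Expr *rhs)
  {
      Expr *e = malloc(sizeof *e);
      assert(e != NULL);       /* a sketch: abort on allocation failure */
      e->kind = kind;
      e->value = value;
      e->name = name;
      e->lhs = lhs;
      e->rhs = rhs;
      return e;
  }

  Expr *lit(long v)          { return mk(EXPR_LIT, v, NULL, NULL, NULL); }
  Expr *var(const char *n)   { return mk(EXPR_VAR, 0, n, NULL, NULL); }
  Expr *add(Expr *l, Expr *r) { return mk(EXPR_ADD, 0, NULL, l, r); }
  Expr *mul(Expr *l, Expr *r) { return mk(EXPR_MUL, 0, NULL, l, r); }

  /* Fold bottom-up, in place: a binary node whose two children are both
   * literals is rewritten into a single literal. Signed overflow is
   * ignored in this sketch. */
  Expr *fold(Expr *e)
  {
      if (e->kind != EXPR_ADD && e->kind != EXPR_MUL)
          return e;            /* literals and variables are left alone */

      e->lhs = fold(e->lhs);
      e->rhs = fold(e->rhs);

      if (e->lhs->kind == EXPR_LIT && e->rhs->kind == EXPR_LIT) {
          long l = e->lhs->value, r = e->rhs->value;
          long v = (e->kind == EXPR_ADD) ? l + r : l * r;
          free(e->lhs);
          free(e->rhs);
          e->kind = EXPR_LIT;
          e->value = v;
          e->lhs = e->rhs = NULL;
      }
      return e;
  }

  void expr_free(Expr *e)
  {
      if (e == NULL)
          return;
      expr_free(e->lhs);
      expr_free(e->rhs);
      free(e);
  }
  ```

  With these definitions, `fold(add(mul(lit(2), lit(3)), lit(4)))` collapses to a single literal node, while `fold(add(var("x"), mul(lit(2), lit(5))))` keeps the addition and only folds its right-hand side.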
  > Every feature must be crucial.
  I'd like to hear more of your ideas on designing such a language too. And, out of curiosity, what is your context and background for it?
fithisux 7 hours ago
A C3-to-C compiler could be one proposal.
- anqurvanillapy 3 hours ago
  Ah, that should be good for source-level compatibility. But I'm thinking about extending existing codebases that span the kernel and user space, e.g. DPDK, SPDK, FUSE, kernel modules, etc. I'm curious how C3 would be adopted in such projects.
  - fithisux 2 hours ago
    Start small.
    - anqurvanillapy 2 hours ago
      And then? https://github.com/anqurvanillapy/TinyLean
      - fithisux 19 minutes ago
        Very very interesting for me. I always wanted to do something similar for Maude in Golang (Python is not a bad choice).
        Currently my focus is on data engineering, but I can use it as an inspiration.
        I was talking about a C3-to-C translator; that is what I meant by starting small.