C/C++ Embedded Files (2013)

(4rknova.com)

37 points | by ibobev 4 hours ago

9 comments

  • saidnooneever 5 minutes ago
    You could also use the linker to link basically anything into the binary, wherever you like.

    It might be a bit of an 'arcane' way to do it, but to me it always seemed the logical one. You can also define symbols around the data and use extern in your C/C++ program to reference them, so you can still access the data in light of dynamic linking, ASLR, etc.

    Here is a resource on it with some examples: https://wiki.osdev.org/Linker_Scripts

    You can include any file: another executable, images, etc. No need for weird stuff in the C sources.

    On the flip side, is there a benefit to doing it inside the source code (apart from not having to roll your own linker script and learn that dragon)?

  • gavinray 3 hours ago
    Outdated. The modern solution, #embed, is built into the language now:

    https://en.cppreference.com/w/c/preprocessor/embed
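
    A minimal sketch of how #embed might be used behind a feature check, assuming a C23-style preprocessor. Embedding `__FILE__` (the source file itself) is only there to keep the example self-contained; `HAVE_EMBED`, `embedded_data` and `embedded_size` are made-up names, and the fallback array is a placeholder for whatever a generator script would normally produce.

    ```c
    #include <stddef.h>

    /* __has_embed is itself part of C23, so check that it exists before using it.
       Older preprocessors skip the nested #if without evaluating it. */
    #ifdef __has_embed
    #  if __has_embed(__FILE__) == __STDC_EMBED_FOUND__
    #    define HAVE_EMBED 1   /* hypothetical macro name */
    #  endif
    #endif

    #ifdef HAVE_EMBED
    /* Illustration only: embed this source file itself as raw bytes. */
    static const unsigned char embedded_data[] = {
    #embed __FILE__
    };
    #else
    /* Placeholder fallback for compilers without #embed; a real build would
       include a generated array here instead. */
    static const unsigned char embedded_data[] = {
        'f', 'a', 'l', 'l', 'b', 'a', 'c', 'k'
    };
    #endif

    static const size_t embedded_size = sizeof embedded_data;
    ```

    Either branch leaves the rest of the program with the same two names, so the fallback can be dropped once every target compiler supports the directive.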

    • indigoabstract 1 hour ago
      That's good to know, but I've noticed it was only added in C23 and C++26. It seems to be supported in GCC 15 and Clang 19, but not MSVC.

      I think in a few (3-4?) years it will be safe to use, but in any case not now.

      Still, good to know that it exists.

      • gmueckl 1 hour ago
        I would assume that this is easy enough to implement that it will likely appear in a minor update to the upcoming Visual Studio version. MS has kept updating the compiler since VS 2022, too.
    • monegator 2 hours ago
      Let me know when my embedded target's compiler is C23 compliant (I mean, I wish. We may be getting C11 or even C17 sometime next year, but I'm not holding my breath).
    • jcalvinowens 2 hours ago
      It will be at least a decade before I can rely on that in systems software that needs to be portable.
    • rolandhvar 3 hours ago
      The thing that always irks me about c++ is this sort of thing:

      > Explanation 1) Searches for the resource identified by h-char-sequence in implementation-defined manner.

      Okay, so now I have to make assumptions that the implementation is reasonable, and won't go and "search" by asking an LLM or accidentally revealing my credit card details to a third party, right?

      And even if the implementation _is_ reasonable the only way I know what "search" means in this context is by looking at an example, and the example says "it's basically a filename".

      So now I think to myself: if I want to remain portable, I'll just write a python script to do a damn substitution to embed my file, which is guaranteed to work under _any_ implementation and I don't have to worry about it as soon as I have my source file.

      Does anyone else feel this way or is it just me?

      • Calavar 2 hours ago
        You're not the only one who feels that way, but IMHO it's not a valid complaint.

        The C++ standard says implementation defined because the weeds get very thick very quickly:

        - Are paths formed with forward slash or backslash?

        - Case sensitive?

        - NT style drive letter or Posix style mounts?

        - For relative paths, what is it relative to? When there are multiple matches, what is the algorithm to determine priority?

        - What about symlinks and hard links?

        - Are HTTP and FTP URIs supported (e.g. for an online IDE like godbolt)? If so, which versions of those protocols? TLS 1.3+ only? Are you going to accept SHA-1?

        - Should the file read be transactional?

        People already complain that the C++ standard is overly complicated. So instead of adding even more complexity by redefining the OS semantics of your build platform in a language spec, they use "implementation defined" as a shorthand for "your compiler will call fopen", plus some implementation wiggle room like command line options for specifying search paths and the strategy for long paths on Windows.

        "What if #embed steals my credit card data" is a pointless strawman. If a malicious compiler dev wanted to steal your credit card data, they'd just inject the malicious code, not act like a genie, searching the C++ spec with a fine-tooth comb for a place where they could execute malicious code while still *technically* being standards conformant. You know that, I know that, we all know that. So why are we wasting words discussing it?

        • gmueckl 1 hour ago
          The real reason why this stuff is underspecified in the spec is that some mainframe operating systems don't have file systems in the common modern sense, but support C++. Those vendors push back a lot against narrowed definitions as far as I know.
        • AlotOfReading 2 hours ago
          Including files also opens up some potential security issues that the standards committee just didn't want to prescribe solutions to. Compiler explorer hides easter eggs around the virtual filesystem, for example:

          https://godbolt.org/z/KcqTM5bTr

      • orbital223 2 hours ago
        > So now I think to myself: if I want to remain portable, I'll just write a python script

        How can you know that your Python implementation won't send your credit card details to an LLM when it runs your script? It does not follow an ISO standard that says it can't do that. You're not making assumptions about its behavior, are you?

      • GabrielTFS 1 hour ago
        #include also searches for the file you give it in an "implementation-defined manner", so if you have this complaint about #embed, you ought to also consider #include equally problematic
      • CamouflagedKiwi 3 hours ago
        This doesn't sound like the kind of portability anyone is really worried about. I get that the docs on the linked site are written in standards-ese and are complicated by macro replacement, but I don't think sending your credit card details away is gonna be an outcome. If it were, an uncharitable implementation with access to your card details would be free to do that any time you gave it input invoking undefined behaviour (which is of course not uncommon, especially in incorrect code).
      • david2ndaccount 2 hours ago
        If you want to remain portable, write your code in the intersection of the big 3 - GCC, Clang and MSVC - and you’ll be good enough. Other implementations will either be weird enough that many things you’d expect to work won’t or are forced to copy what those 3 do anyway.
      • duped 22 minutes ago
        this take is basically equivalent to "don't write software unless you write the stack from scratch."
      • MoltenMan 3 hours ago
        ...what? What are you talking about? In what world would a compiler implement a preprocessor directive that ever uses an LLM, the internet, or your credit card details (from where would it get those)??? Every language leaves some things up to the implementation, for example what happens on undefined behavior. Do you get worried that someone will steal your bitcoin every time you hit a use-after-free? Of course not! Even in Python when you OOM -- at least in CPython -- you crash with undefined behavior.
        • MoltenMan 3 hours ago
          Sorry for being so aggressive. I suppose I'm just very confused at where you're coming from.
  • CamouflagedKiwi 3 hours ago
    You can also do it using ld - it's something like ld -r --format binary -o out.o <file>, although you do want some build system assistance to generate header files allowing you to access the thing (somewhat similar to the assembly example here). It's a bit of a performance but I strongly prefer it to generating header files in the earlier options - those header files can end up being _very_ large (they generally multiply up the size of the embedded file by 2-4x) and slow to compile.
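
    A rough sketch of the header side of this, assuming GNU ld and a hypothetical input file named logo.png. GNU ld derives the `_binary_..._start` and `_binary_..._end` symbol names from the input path (characters that are not alphanumeric become underscores); `blob_size` and `LOGO_PNG_SIZE` are names invented here for illustration.

    ```c
    #include <stddef.h>

    /* Symbols GNU ld synthesizes for an object made from "logo.png"
       (hypothetical file name). They only need to exist at link time
       in programs that actually reference them. */
    extern const unsigned char _binary_logo_png_start[];
    extern const unsigned char _binary_logo_png_end[];

    /* Byte length of a blob delimited by a start/end symbol pair. */
    static size_t blob_size(const unsigned char *start, const unsigned char *end)
    {
        return (size_t)(end - start);
    }

    /* Hypothetical convenience macro for the blob above. */
    #define LOGO_PNG_SIZE \
        blob_size(_binary_logo_png_start, _binary_logo_png_end)
    ```

    A build step would generate one such header per embedded file; the header stays tiny no matter how large the embedded data is.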

    All a bit less relevant now that recent C and C++ standards have this built in. It's generally something languages have been too slow on, IMO (e.g. Go picked this up four or so years ago, after a bunch of less nice home-grown alternatives). It's actually just really useful for making things work in the real world, especially for languages that you can distribute as single-file binaries (which IMO should be all of them, but sadly that's not always the case).

  • cyco130 1 hour ago
    My very first open source project[1] aimed to solve the same problem. Nice to see it still has quite a few weekly downloads.

    [1] https://sourceforge.net/projects/bin2c/

  • delduca 2 hours ago
    My current workaround until it arrives in all C++ compilers

    ```
    inline constexpr auto bootstrap =
    #include "bootstrap.lua"
    ;

    // ... later

    lua.script(bootstrap, "@bootstrap");
    ```

    The Lua code:

    ```
    R"(
    -- your code here
    )";
    ```

  • mgaunard 3 hours ago
    Surely the preprocessor method doesn't work in the general case, since the data can contain commas or parentheses.

    Regardless, all of the methods suggested are terrible. If you don't have access to #embed, just write a trivial Python script.

    • david2ndaccount 2 hours ago
      You can apply `#` to __VA_ARGS__, which won't preserve the exact whitespace, but for many languages it's good enough. The biggest issue is that you can't have `#` in the text.
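
      A small sketch of that trick, assuming a C99 or later preprocessor. `EMBED_TEXT` and `lua_snippet` are names made up for the example. Top-level commas survive because they are part of __VA_ARGS__, while runs of whitespace (including newlines) collapse to single spaces.

      ```c
      /* Stringify everything passed to the macro, commas included. */
      #define EMBED_TEXT(...) #__VA_ARGS__

      /* The text must still consist of valid preprocessing tokens, so Lua
         comments and the '#' length operator are off limits here. */
      static const char lua_snippet[] = EMBED_TEXT(
          local t = { 1, 2, 3 }
          return t[1] + t[2]
      );
      /* lua_snippet is now "local t = { 1, 2, 3 } return t[1] + t[2]" */
      ```

      Since line breaks are lost, this only suits languages where a newline is not significant.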
    • oguz-ismail2 3 hours ago
      How is `xxd -i` terrible?
      • mgaunard 2 hours ago
        It's still lacking content that goes before/after the output.

        Just write a Python script that does the whole thing.

        • oguz-ismail2 2 hours ago
          Don't know what you mean, it works fine here. Python is too large and unreliable a dependency for something so trivial (which can be accomplished using standard POSIX utilities if need be).
          • astrobe_ 2 hours ago
            Indeed, even writing this utility in C is trivial and adds zero extra dependencies for a pure C/C++ project. Avoiding #embed also removes the dependency on a C23- or C++26-capable compiler, which might not be available in uncommon scenarios.
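
            A minimal sketch of the core of such a utility, not any particular existing tool. `emit_c_array` is a hypothetical name, and the output layout (12 bytes per line, a `<name>_len` variable) is just one arbitrary choice. It assumes `len > 0`, since an empty initializer list is not valid before C23.

            ```c
            #include <stdio.h>
            #include <stddef.h>

            /* Writes `data` to `out` as a C array definition called `name`,
               followed by a `<name>_len` variable holding the byte count.
               Returns 0 on success, -1 on a write error. */
            static int emit_c_array(FILE *out, const char *name,
                                    const unsigned char *data, size_t len)
            {
                if (fprintf(out, "const unsigned char %s[] = {", name) < 0)
                    return -1;
                for (size_t i = 0; i < len; i++) {
                    /* Start a new indented line every 12 bytes. */
                    const char *sep = (i % 12 == 0) ? "\n    " : " ";
                    if (fprintf(out, "%s0x%02x,", sep, data[i]) < 0)
                        return -1;
                }
                if (fprintf(out, "\n};\nconst unsigned long %s_len = %lu;\n",
                            name, (unsigned long)len) < 0)
                    return -1;
                return 0;
            }
            ```

            A real tool would wrap this in a `main` that reads the input file in binary mode and takes the array name from the command line.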
          • jcalvinowens 2 hours ago
            Python is pretty much mandatory on Linux systems nowadays. Unless you're dealing with something really minimalist or trying to be very portable, it's safe to rely on.
            • oguz-ismail2 2 hours ago
              > it's safe to rely on

              Is there any guarantee they won't break backwards compatibility again?

  • borcunozkablan 3 hours ago
    Why don't you #embed?
    • qbow883 2 hours ago
      Because the linked article is from 2013.