{ "version": "https://jsonfeed.org/version/1.1", "title": "blog", "expired": false, "items": [ { "id": "gen/writing-compilers.html", "url": "gen/writing-compilers.html", "title": "Writing Compilers", "content_html": "\n\n \n \n \n \n \n \n\n Writing Compilers\n \n \n \n \n \n \n \n
\n

Writing Compilers

\n

Date: 2022-01-10T17:48:22-05:00

\n
\n

Table of Contents

\n \n

In his post \u201cRich Programmer Food\u201d, Steve Yegge explains why you should learn compilers:

\n
\n

If you don\u2019t know how compilers work, then you don\u2019t know how computers work. If you\u2019re not 100% sure whether you know how compilers work, then you don\u2019t know how they work.

\n
\n

Throughout this post, Steve has some witticisms and some harsh realizations:

\n
\n

If you don\u2019t take compilers then you run the risk of forever being on the programmer B-list: the kind of eager young architect who becomes a saturnine old architect who spends a career building large systems and being damned proud of it.

\n
\n

While I wouldn\u2019t say the article wholly convinced me to learn about compilers, I do agree that compilation problems are everywhere.

\n

In front-end work, where I started off, there's a glut of frameworks: React, Angular, Vue, and Svelte, along with tooling like webpack, Parcel, esbuild, Babel, and Browserify.

\n

What do they all have in common? They're all compilers. React takes JSX and turns it into plain JavaScript (which in turn produces the HTML).

\n

Angular popularized TypeScript, Microsoft's language that compiles to JavaScript.

\n

Vue has .vue templates, which compile to HTML, CSS, and JS.

\n

Svelte has its own .svelte files.

\n

All the other build tools I mentioned take JavaScript, minify it, tree-shake it (dead-code elimination), and bundle it for you so it can be served on the internet.

\n

All of these are compiler problems.

\n

My favorite language, Rust, has a great deal of compiler work in its core language, with design tradeoffs that make it easy for newcomers to adopt and enjoyable for experienced users.

\n

In mobile development, Kotlin and Swift, as well as many libraries, reduce to compiler problems: they end up manipulating ASTs to produce better code, or compile to some bytecode that is executed on the target platform.

\n

Compiler problems really are everywhere.

\n

Here\u2019s another quote from Ras Bodik:

\n
\n

Don\u2019t be a boilerplate programmer. Instead, build tools for users and other programmers. Take historical note of textile and steel industries: do you want to build machines and tools, or do you want to operate those machines?

\n
\n

Got the message? Compilers are really important. Or so I think. But how does one learn compilers? Well, let\u2019s scour the internet.

\n

A Plan of Attack

\n

There's no shortage of great material on compilers on the internet, but I want to focus on three resources I'm currently using, since I think they run the gamut: one is a great starting resource, another is a great intermediate resource, and the last is extremely hard but rewarding.

\n

The Resources:

\n
    \n
  1. https://keleshev.com/compiling-to-assembly-from-scratch/
  2. https://craftinginterpreters.com/
  3. https://github.com/rui314/chibicc
\n

Compiling to Assembly from Scratch

\n

Compiling to Assembly from Scratch is the first resource I looked at to start my compiler-writing journey. It's a short book on writing a small ARM32-emitting compiler in TypeScript. The compiler parses with parser combinators, then emits simple ARM32 code using the visitor pattern.

\n

Parser combinators are a parsing technique that's been picking up steam recently. Because TypeScript has first-class support for regular expressions, the technique fits well here.

\n

The book also goes over basic ARM32 instructions, and at the end of this pretty short book (~200 pages), you have a working compiler that turns a subset of JavaScript into ARM32 assembly.

\n

Worth every penny.

\n

Crafting Interpreters

\n

Crafting Interpreters is a great book \u2013 the first half of the book covers writing a tree-walk interpreter in Java, while the second half of the book involves writing a bytecode VM in C for a non-trivial language called \u201cLox\u201d.

\n

Bob Nystrom really knows his stuff \u2013 the prose is clean, and every line of code written is well explained.

\n

That being said, for the first part, I didn't really want to write any Java (sorry, Oracle), so I found a Rust transcription of the first part of the book's code (https://github.com/jeschkies/lox-rs) and used that code as my basis.

\n

At the end of the first part of the book, I really felt as though I got the hang of the basics of compiler writing.

\n

I still need to go through the second part, but I\u2019m really enjoying it so far!

\n

ChibiCC

\n

This resource is a bit different: it's a git repo by the creator of mold (a modern, high-speed linker) that builds a C Compiler (CC) in C.

\n

This repository follows the paper \u201cAn Incremental Approach to Compiler Construction\u201d by Ghuloum, http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf which advocates writing a simple compiler step by step and adding functionality little by little.

\n

Rui Ueyama, the author of this repo, writes clean C code for every commit, and each message details adding a new feature to the C compiler. There\u2019s no accompanying instructional material, so you\u2019re on your own to read the diffs and try to ascribe meaning to them, but it is important to read lots of code, and what better way to start than in such a structured format?

\n

Conclusion

\n

After going through a few resources for compiler construction, I\u2019m starting to get a better understanding of compilers, and seeing those kinds of problems crop up all the time. Writing compilers has also motivated me to read more code, and I\u2019m hoping to be able to read code from larger projects to write better code in the future!

\n \n\n", "date_published": "2022-01-10T17:48:22-05:00" }, { "id": "gen/writing-portable-c.html", "url": "gen/writing-portable-c.html", "title": "Writing Portable C", "content_html": "\n\n \n \n \n \n \n \n\n Writing Portable C\n \n \n \n \n \n \n \n \n
\n

Writing Portable C

\n

Date: 2021-11-15T18:06:09-05:00

\n
\n

Table of Contents

\n \n

In my free time I've been hacking away on unix-utils, a project that implements some common unix utilities. To test how portable C really is for the average programmer, I decided to lean on open source and see how many platforms I could target my C at. I wanted to write the most portable code I could (which meant targeting only POSIX functions) and to use a minimalistic stdlib that tries its best to adhere to the standards (musl). This let me write C code for about 50 targets. Let's go deeper into the background and how that was all possible:

\n

POSIX and SUS

\n

In the 70s, C was made at AT&T. In the 80s, it escaped, becoming more popular, before eventually being standardized by ANSI (C89). But what about standards for operating systems? Enter POSIX (the Portable Operating System Interface), a standard for operating system interfaces. In 1988, the first version of POSIX was released by the IEEE, detailing common interfaces like signals, pipes, and the C standard library.

\n

The POSIX standards were fairly minimalistic in the 80s, adding some extensions (like real-time programming and thread extensions) in the early 90s before being subsumed by the Austin Group, a committee that designed the Single UNIX Specification. The Austin Group has steered the POSIX standards since 1997, creating POSIX 1997/SUS v2, POSIX 2001/SUS v3, and POSIX 2008/SUS v4.

\n

Since 2008, there have been two minor corrections to the POSIX standards (in 2011 and 2017), but the two most common POSIX standards in use are POSIX 2001 and POSIX 2008, which is where we'll direct most of our attention.

\n

POSIX compliance in particular ends up being extremely important, because most operating systems have at least some level of POSIX compliance: Linux, the BSDs, macOS, and Windows all do to some extent. That means our C code can target all of them by following the standards, which makes our code more flexible.

\n

This is so important that GCC (the GNU Compiler Collection) implements flags that check for strict compliance with the POSIX standard of your choice.

\n

In my Makefile, I have this line, which says to compile my code strictly according to the standard.

\n
CFLAGS = -std=c99 -D_POSIX_C_SOURCE=200809L
\n

Since I wrote the first draft of my utilities on a macOS machine with no POSIX compatibility flags, you can imagine there was a lot of breakage. As to why there was so much breakage, well, that requires another history lesson.

\n

Glibc vs Musl

\n

In the 80s, the Free Software Foundation (FSF) wanted to create the ideal \u201cFree\u201d programming environment. To do so, they were going to start from the top-down, by implementing the user space (a C compiler, a shell, the POSIX shell utilities, etc), and then build an OS kernel (GNU Hurd). GNU succeeded at one part of their mission, by providing the most common userspace tools to date (GCC, Bash, and the GNU utils). However, GNU\u2019s kernel lost out to Linux, and the rest is history.

\n

Linux started out using GNU's userspace, including GNU's C standard library (glibc), but it now supports a wide variety of C standard libraries (libcs for short). One of those is musl, the standard library of this article.

\n

The choice of standard library would be entirely inconsequential if not for one detail: musl supports static linking well, and glibc effectively does not.

\n

Sure, glibc supports a lot of non-standard extensions, and sure, glibc executables are more bloated than their musl counterparts, but static linking lets us execute our binaries without having a libc installed on the platform.

\n

That means our code can reach even more users!

\n

Much blood has been spilt on static vs dynamic linking, so I will spare you the carnage by simply saying that static linking tends to be more convenient for the end user (they need fewer dependencies to run the code), which is good for us, the application builders.

\n

What Sacrifices were made?

\n

How do you make static binaries?

\n

Going back to building some unix utilities: I downloaded a musl-gcc toolchain, logged into my Linux VM, and started compiling.

\n

The first issue I ran into was that musl-gcc didn\u2019t compile static binaries.

\n

I added the flag -static to my build, but file and ldd ended up telling me that my binary was still dynamically linked.

\n

I dug through troves of documentation. Eventually, I discovered that it wasn't enough to provide the -static flag, because GCC can ignore it; you have to provide --static as well. And if that wasn't enough, that still didn't produce static binaries: you also had to disable PIE (position-independent executables) with the -no-pie flag.

\n

Finally, I had compiled a hello world binary statically. Time to move on!

\n

Don\u2019t name your functions _init

\n

I then tried to compile my utilities. I wanted to decrease duplication, so I wrote a header file with a function called _init. This caused a duplicate symbol error, since musl already defines _init in crti.o.

\n

Of course, GCC never complained, so I had to rename this function.

\n

getopt_long doesn\u2019t exist

\n

Next up, getopt_long (get options with long flags) isn't in the POSIX standard. Unsurprising. POSIX only specifies plain getopt, which supports short options only; long options like --file or --color are a GNUism.

\n

I ended up finding a copy of getopt_long online and rewriting my header file includes for my utilities.

\n

Sysctl isn\u2019t standard

\n

Next up, I had a compiler error where my implementation of uptime failed to compile. <sys/sysctl.h> is a BSDism (present on macOS), not part of POSIX. Linux offers <linux/sysctl.h> for convenience, but as its name might indicate, it's not portable.

\n

Next!

\n

lstat has optional fields

\n

In my implementation of stat, I used the functions major, minor, and ctime, not all of which are POSIX compliant. They're useful on macOS, so I can gate them behind an __APPLE__ macro, but that makes the code less succinct. Oh well.

\n

NI_MAXHOST isn\u2019t defined

\n

As an oddity, musl doesn't define NI_MAXHOST at all. This constant is useful for dig, which returns the IP address for a given hostname. I ended up defining it myself when it wasn't already defined.

\n

Getting the Toolchains

\n

With all these changes made, our code will now compile for Linux + musl, thankfully. The next problem was actually getting our toolchains.

\n

Luckily, after some googling, I found out about musl.cc, a website that releases musl-gcc toolchains.

\n

Now, since I didn\u2019t want to create an undue amount of load onto this website, I created a mirror of it: https://github.com/Takashiidobe/muslcc.

\n

Next, I had to create a GitHub Action that would fetch the required compiler, set it up properly, compile all of the binaries, strip the debug information, tar them into one directory, and release them on a push to tags. Phew!

\n

This part turned out to be a lot of guesswork and letting it run, so I\u2019ll leave the final results here:

\n

https://github.com/Takashiidobe/unix-utils/blob/master/.github/workflows/release.yml

\n

And the repo here:

\n

https://github.com/Takashiidobe/unix-utils

\n

In Short

\n

It's amazing that you can write code that targets so many architectures, and compile for them easily, all for free, with the power of open source (and Microsoft's wallet; thanks, GitHub Actions).

\n

With this, I was able to build for 52 architectures and release code for them (I ended up adding in support for x86_64 Darwin and arm64 Darwin).

\n

Viva portable code.

\n \n\n", "date_published": "2021-11-15T18:06:09-05:00" }, { "id": "gen/building-rust-binaries-for-different-platforms.html", "url": "gen/building-rust-binaries-for-different-platforms.html", "title": "Building Rust binaries for different platforms", "content_html": "\n\n \n \n \n \n \n \n\n Building Rust binaries for different platforms\n \n \n \n \n \n \n \n \n
\n

Building Rust binaries for different platforms

\n

Date: 2021-11-03T09:24:46-05:00

\n
\n

Table of Contents

\n \n

Rust has great support for cross compilation: with cross, you can install the required C toolchain and linker and cross-compile your Rust code into a binary that runs on your targeted platform. Sweet!

\n

If you\u2019d like to look at the code and results, it\u2019s in this repo here: https://github.com/Takashiidobe/rust-build-binary-github-actions

\n

Rust library writers use this feature to build and test for platforms other than their own: hyperfine, for example, builds for 11 different platforms.

\n

The rustc book has a page on targets and tiers of support. Tier 1 supports 8 targets:

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
Tier 1
aarch64-unknown-linux-gnu
i686-pc-windows-gnu
i686-unknown-linux-gnu
i686-pc-windows-msvc
x86_64-apple-darwin
x86_64-pc-windows-gnu
x86_64-pc-windows-msvc
x86_64-unknown-linux-gnu
\n

Tier 2 with Host tools supports 21 targets.

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
Tier 2
aarch64-apple-darwin
aarch64-pc-windows-msvc
aarch64-unknown-linux-musl
arm-unknown-linux-gnueabi
arm-unknown-linux-gnueabihf
armv7-unknown-linux-gnueabihf
mips-unknown-linux-gnu
mips64-unknown-linux-gnuabi64
mips64el-unknown-linux-gnuabi64
mipsel-unknown-linux-gnu
powerpc-unknown-linux-gnu
powerpc64-unknown-linux-gnu
powerpc64le-unknown-linux-gnu
riscv64gc-unknown-linux-gnu
s390x-unknown-linux-gnu
x86_64-unknown-freebsd
x86_64-unknown-illumos
arm-unknown-linux-musleabihf
i686-unknown-linux-musl
x86_64-unknown-linux-musl
x86_64-unknown-netbsd
\n

Let\u2019s try to build a binary for all 29 targets.

\n

A Note on Targets

\n

The Rust RFC for Target support: https://rust-lang.github.io/rfcs/0131-target-specification.html

\n

A target is defined in three or four parts:

\n

$architecture-$vendor-$os-$environment

\n

The environment is optional, so some targets have three parts and some have four.

\n

Let\u2019s take x86_64-apple-darwin for example.

\n
    \n
  • x86_64 is the architecture
  • apple is the vendor
  • darwin is the os
\n

You'll notice here that there is no $environment. When the environment is omitted, the platform's default toolchain and C library are assumed.

\n

Let\u2019s take one with four parts: i686-pc-windows-msvc.

\n
    \n
  • i686 is the architecture
  • pc is the vendor
  • windows is the os
  • msvc is the environment
\n

In this target, the environment is specified as msvc, Microsoft's C/C++ compiler. This is the most popular compiler for Windows, but it need not be: in the same Tier 1 table, there's this target: i686-pc-windows-gnu.

\n

The only thing that's changed is the environment, which is now gnu. Windows can use GCC instead of MSVC, so building for this target uses GCC.

\n

Architectures

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
Architecture: Notes
aarch64: ARM 64 bit
i686: Intel 32 bit
x86_64: Intel 64 bit
arm: ARM 32 bit
armv7: ARMv7 32 bit
mips: MIPS 32 bit
mips64: MIPS 64 bit
mips64el: MIPS 64 bit Little Endian
mipsel: MIPS 32 bit Little Endian
powerpc: IBM 32 bit
powerpc64: IBM 64 bit
riscv64gc: RISC-V 64 bit
s390x: IBM Z 64 bit
\n

Vendors

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
Vendor: Notes
pc: Microsoft
apple: Apple
unknown: Unknown
\n

Operating Systems

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
Operating System: Notes
darwin: Apple's OS
linux: Linux OS
windows: Microsoft's OS
freebsd: FreeBSD OS
netbsd: NetBSD OS
illumos: Illumos OS, a Solaris derivative
\n

Environments

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
Environment: Notes
musl: Musl C library
gnu: GNU's C library (glibc)
msvc: Microsoft Visual C library
freebsd: FreeBSD's C library
netbsd: NetBSD's C library
illumos: Illumos' C library
\n

When you go to the releases tab to download a particular binary, you\u2019ll need to know these four things to download a binary that runs on your system.

\n

Now, let\u2019s start building for all these systems.

\n

Building Binaries for ~30 Targets

\n

We're going to use GitHub Actions, a task runner on github.com, to build our binaries. Our binary is a simple hello-world program.

\n

If you\u2019d just like to look at the github actions file, it\u2019s located here: https://github.com/Takashiidobe/rust-build-binary-github-actions/blob/master/.github/workflows/release.yml

\n

Conceptually, we\u2019d like to do the following:

\n
    \n
  • Set up our target environments.
  • Download the C compiler (environment) we need.
  • Download a docker image of the OS we require.
  • Download the Rust toolchain onto the docker container.
  • Build the binary.
  • Optionally strip debug symbols.
  • Publish it to the GitHub releases tab.
\n

We\u2019ll first start out by defining our github action and setting up the target environments:

\n
name: release\n\nenv:\n  MIN_SUPPORTED_RUST_VERSION: "1.56.0"\n  CICD_INTERMEDIATES_DIR: "_cicd-intermediates"\n\non:\n  push:\n    tags:\n      - '*'\n\njobs:\n  build:\n    name: ${{ matrix.job.target }} (${{ matrix.job.os }})\n    runs-on: ${{ matrix.job.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        job:\n          # Tier 1\n          - { target: aarch64-unknown-linux-gnu      , os: ubuntu-20.04, use-cross: true }\n          - { target: i686-pc-windows-gnu            , os: windows-2019                  }\n          - { target: i686-unknown-linux-gnu         , os: ubuntu-20.04, use-cross: true }\n          - { target: i686-pc-windows-msvc           , os: windows-2019                  }\n          - { target: x86_64-apple-darwin            , os: macos-10.15                   }\n          - { target: x86_64-pc-windows-gnu          , os: windows-2019                  }\n          - { target: x86_64-pc-windows-msvc         , os: windows-2019                  }\n          - { target: x86_64-unknown-linux-gnu       , os: ubuntu-20.04                  }\n          # Tier 2 with Host Tools\n          - { target: aarch64-apple-darwin           , os: macos-11.0                    }\n          - { target: aarch64-pc-windows-msvc        , os: windows-2019                  }\n          - { target: aarch64-unknown-linux-musl     , os: ubuntu-20.04, use-cross: true }\n          - { target: arm-unknown-linux-gnueabi      , os: ubuntu-20.04, use-cross: true }\n          - { target: arm-unknown-linux-gnueabihf    , os: ubuntu-20.04, use-cross: true }\n          - { target: armv7-unknown-linux-gnueabihf  , os: ubuntu-20.04, use-cross: true }\n          - { target: mips-unknown-linux-gnu         , os: ubuntu-20.04, use-cross: true }\n          - { target: mips64-unknown-linux-gnuabi64  , os: ubuntu-20.04, use-cross: true }\n          - { target: mips64el-unknown-linux-gnuabi64, os: ubuntu-20.04, use-cross: true }\n          - { target: mipsel-unknown-linux-gnu       , os: ubuntu-20.04, use-cross: true }\n          - { target: powerpc-unknown-linux-gnu      , os: ubuntu-20.04, use-cross: true }\n          - { target: powerpc64-unknown-linux-gnu    , os: ubuntu-20.04, use-cross: true }\n          - { target: powerpc64le-unknown-linux-gnu  , os: ubuntu-20.04, use-cross: true }\n          - { target: riscv64gc-unknown-linux-gnu    , os: ubuntu-20.04, use-cross: true }\n          - { target: s390x-unknown-linux-gnu        , os: ubuntu-20.04, use-cross: true }\n          - { target: x86_64-unknown-freebsd         , os: ubuntu-20.04, use-cross: true }\n          - { target: x86_64-unknown-illumos         , os: ubuntu-20.04, use-cross: true }\n          - { target: arm-unknown-linux-musleabihf   , os: ubuntu-20.04, use-cross: true }\n          - { target: i686-unknown-linux-musl        , os: ubuntu-20.04, use-cross: true }\n          - { target: x86_64-unknown-linux-musl      , os: ubuntu-20.04, use-cross: true }\n          - { target: x86_64-unknown-netbsd          , os: ubuntu-20.04, use-cross: true }
\n

Checking out our code:

\n
    steps:\n    - name: Checkout source code\n      uses: actions/checkout@v2
\n

Downloading the C compiler

\n

Most of the time, the C compiler we need is already installed, but in some cases it'll be overridden by another compiler.

\n

In that case, we'll need to put the correct C compiler in place (i686-pc-windows-gnu has GCC installed, but it's not on the $PATH).

\n
    - name: Install prerequisites\n      shell: bash\n      run: |\n        case ${{ matrix.job.target }} in\n          arm-unknown-linux-*) sudo apt-get -y update ; sudo apt-get -y install gcc-arm-linux-gnueabihf ;;\n          aarch64-unknown-linux-gnu) sudo apt-get -y update ; sudo apt-get -y install gcc-aarch64-linux-gnu ;;\n          i686-pc-windows-gnu) echo "C:\\msys64\\mingw32\\bin" >> $GITHUB_PATH\n        esac
\n

Installing the Rust toolchain

\n
    - name: Install Rust toolchain\n      uses: actions-rs/toolchain@v1\n      with:\n        toolchain: stable\n        target: ${{ matrix.job.target }}\n        override: true\n        profile: minimal # minimal component installation (ie, no documentation)
\n

Building the executable

\n
    - name: Build\n      uses: actions-rs/cargo@v1\n      with:\n        use-cross: ${{ matrix.job.use-cross }}\n        command: build\n        args: --locked --release --target=${{ matrix.job.target }}
\n

Stripping debug information from binary

\n
    - name: Strip debug information from executable\n      id: strip\n      shell: bash\n      run: |\n        # Figure out suffix of binary\n        EXE_suffix=""\n        case ${{ matrix.job.target }} in\n          *-pc-windows-*) EXE_suffix=".exe" ;;\n        esac;\n        # Figure out what strip tool to use if any\n        STRIP="strip"\n        case ${{ matrix.job.target }} in\n          arm-unknown-linux-*) STRIP="arm-linux-gnueabihf-strip" ;;\n          aarch64-pc-*) STRIP="" ;;\n          aarch64-unknown-*) STRIP="" ;;\n          armv7-unknown-*) STRIP="" ;;\n          mips-unknown-*) STRIP="" ;;\n          mips64-unknown-*) STRIP="" ;;\n          mips64el-unknown-*) STRIP="" ;;\n          mipsel-unknown-*) STRIP="" ;;\n          powerpc-unknown-*) STRIP="" ;;\n          powerpc64-unknown-*) STRIP="" ;;\n          powerpc64le-unknown-*) STRIP="" ;;\n          riscv64gc-unknown-*) STRIP="" ;;\n          s390x-unknown-*) STRIP="" ;;\n          x86_64-unknown-freebsd) STRIP="" ;;\n          x86_64-unknown-illumos) STRIP="" ;;\n        esac;\n        # Setup paths\n        BIN_DIR="${{ env.CICD_INTERMEDIATES_DIR }}/stripped-release-bin/"\n        mkdir -p "${BIN_DIR}"\n        BIN_NAME="${{ env.PROJECT_NAME }}${EXE_suffix}"\n        BIN_PATH="${BIN_DIR}/${BIN_NAME}"\n        TRIPLET_NAME="${{ matrix.job.target }}"\n        # Copy the release build binary to the result location\n        cp "target/$TRIPLET_NAME/release/${BIN_NAME}" "${BIN_DIR}"\n        # Also strip if possible\n        if [ -n "${STRIP}" ]; then\n          "${STRIP}" "${BIN_PATH}"\n        fi\n        # Let subsequent steps know where to find the (stripped) bin\n        echo ::set-output name=BIN_PATH::${BIN_PATH}\n        echo ::set-output name=BIN_NAME::${BIN_NAME}
\n

And uploading to Github

\n
    - name: Create tarball\n      id: package\n      shell: bash\n      run: |\n        PKG_suffix=".tar.gz" ; case ${{ matrix.job.target }} in *-pc-windows-*) PKG_suffix=".zip" ;; esac;\n        PKG_BASENAME=${PROJECT_NAME}-v${PROJECT_VERSION}-${{ matrix.job.target }}\n        PKG_NAME=${PKG_BASENAME}${PKG_suffix}\n        echo ::set-output name=PKG_NAME::${PKG_NAME}\n        PKG_STAGING="${{ env.CICD_INTERMEDIATES_DIR }}/package"\n        ARCHIVE_DIR="${PKG_STAGING}/${PKG_BASENAME}/"\n        mkdir -p "${ARCHIVE_DIR}"\n        mkdir -p "${ARCHIVE_DIR}/autocomplete"\n        # Binary\n        cp "${{ steps.strip.outputs.BIN_PATH }}" "$ARCHIVE_DIR"\n        # base compressed package\n        pushd "${PKG_STAGING}/" >/dev/null\n        case ${{ matrix.job.target }} in\n          *-pc-windows-*) 7z -y a "${PKG_NAME}" "${PKG_BASENAME}"/* | tail -2 ;;\n          *) tar czf "${PKG_NAME}" "${PKG_BASENAME}"/* ;;\n        esac;\n        popd >/dev/null\n        # Let subsequent steps know where to find the compressed package\n        echo ::set-output name=PKG_PATH::"${PKG_STAGING}/${PKG_NAME}"\n    - name: "Artifact upload: tarball"\n      uses: actions/upload-artifact@master\n      with:\n        name: ${{ steps.package.outputs.PKG_NAME }}\n        path: ${{ steps.package.outputs.PKG_PATH }}\n\n    - name: Check for release\n      id: is-release\n      shell: bash\n      run: |\n        unset IS_RELEASE ; if [[ $GITHUB_REF =~ ^refs/tags/v[0-9].* ]]; then IS_RELEASE='true' ; fi\n        echo ::set-output name=IS_RELEASE::${IS_RELEASE}\n    - name: Publish archives and packages\n      uses: softprops/action-gh-release@v1\n      if: steps.is-release.outputs.IS_RELEASE\n      with:\n        files: |\n          ${{ steps.package.outputs.PKG_PATH }}\n      env:\n        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} 
\n

And after running this GitHub Actions file, we find that… three targets fail to build.

\n

x86_64-unknown-freebsd, x86_64-unknown-illumos, powerpc-unknown-linux-gnu.

\n

Luckily, the error message that cross provides gives a clear indication of what to fix. Cross does not provide an image for these targets, so it defaults to the toolchain it's running on (Ubuntu 20.04), and the linker cannot find the proper libraries. Easy to fix: add a Cross.toml file to the root of the project with docker images for the particular targets, and build again.

\n
[target.x86_64-unknown-freebsd]\nimage = "svenstaro/cross-x86_64-unknown-freebsd"\n\n[target.powerpc64-unknown-linux-gnu]\nimage = "japaric/powerpc64-unknown-linux-gnu"
\n

You'll notice that illumos is missing here: I couldn't find a suitable docker image for it on Docker Hub, so I gave up. If you find one, let me know and I'll update this article.

\n

Results

\n

Out of the 29 targets in Tier 1 and Tier 2 with host tools, it was easy enough to build a binary for 28 of them (we only need a Solaris/Illumos docker image for the last one).

\n

That's pretty good, given that this only took a couple of hours to test out. I hope Rust continues to support this many architectures in the future, and that GitHub Actions keeps being a good platform to make releases on.

\n

If you\u2019d like to take the repo for yourself to build rust binaries on releases for 28 architectures, feel free to clone/fork the repo here: https://github.com/Takashiidobe/rust-build-binary-github-actions

\n \n\n", "date_published": "2021-11-03T09:24:46-05:00" }, { "id": "gen/offline-email.html", "url": "gen/offline-email.html", "title": "Offline e-mail in the terminal", "content_html": "\n\n \n \n \n \n \n \n\n Offline e-mail in the terminal\n \n \n \n \n \n \n \n \n
\n

Offline e-mail in the terminal

\n

Date: 2021-10-27T09:44:59-05:00

\n
\n

Table of Contents

\n \n

I like putting stuff in the terminal. Let\u2019s roll back the clock 30 years and go back to terminal e-mail.

\n

Let\u2019s start with installing an e-mail client.

\n

I like aerc. I also use macOS, so it can be installed with a simple brew install aerc.

\n

Setting up Aerc

\n

Here\u2019s another guide for that: Text based gmail

\n

I personally use Gmail, so I had to create a Gmail app password to provide to aerc.

\n

On first startup, aerc has a startup wizard that helps you set up your account. Nice! Put in your information and enjoy e-mail in the terminal.

\n

My aerc/accounts.conf looks something like this:

\n
[Personal]\nsource        = imap://me@gmail.com:password@imap.gmail.com:993\noutgoing      = smtp+plain://me@gmail.com:$APP_PASSWORD_HERE@smtp.gmail.com:587\nsmtp-starttls = yes\nfrom          = Me <me@gmail.com>\ncopy-to       = Sent
\n

As well, I wanted to change up some of the defaults:

\n

I set my aerc/aerc.conf like so: it sets my pager to bat instead of less -R, and prefers to display the HTML portion of an e-mail when possible, falling back to the plain-text version.

\n

To be able to read HTML e-mail, I uncommented the line for text/html, and use the html filter that\u2019s provided by aerc. This requires w3m and dante, so I brew installed both:

\n
brew install w3m\nbrew install dante
\n
[viewer]\npager=/usr/local/bin/bat\nalternatives=text/html,text/plain\n\n[filters]\nsubject,~^\\[PATCH=awk -f @SHAREDIR@/filters/hldiff\ntext/html=bash /usr/local/share/aerc/filters/html\ntext/*=awk -f /usr/local/share/aerc/filters/plaintext
\n

Great! Now we\u2019re all set up with aerc.

\n

Offline Support

\n

This is great and all, but try to run aerc without an internet connection: it hangs. That's not acceptable! Let's fix that.

\n

Drew DeVault, the original author of aerc, published a guide on making aerc work offline: https://drewdevault.com/2021/05/17/aerc-with-mbsync-postfix.html. We'll follow this guide a bit, but I use Gmail instead of Migadu, and ended up using msmtp instead of postfix, so there'll be a few changes.

\n

Mbsync for reading e-mail offline

\n

Let's start off by installing mbsync. On macOS it's listed under its previous name, isync, so run brew install isync to install it.

\n

We\u2019ll then set it up \u2013 the config file is at ~/.mbsyncrc, so create that and fill it with this:

\n
IMAPStore gmail-remote\nHost imap.gmail.com\nAuthMechs LOGIN\nUser you@gmail.com\nPass $APP_PASSWORD_HERE \nSSLType IMAPS\n\nMaildirStore gmail-local\nPath ~/mail/gmail/\nInbox ~/mail/gmail/INBOX\nSubfolders Verbatim\n\nChannel gmail\nFar :gmail-remote:\nNear :gmail-local:\nExpunge Both\nPatterns * !"[Gmail]/All Mail" !"[Gmail]/Important" !"[Gmail]/Starred" !"[Gmail]/Bin"\nSyncState *
\n

If you don\u2019t already have a ~/mail/gmail/INBOX folder, create it with mkdir -p ~/mail/gmail/INBOX.

\n

Now, if you run mbsync gmail, all of your e-mail will be synced to your ~/mail/gmail folder.

\n

Now, we just need aerc to read mail locally instead of from Gmail’s servers.

\n

Go back to aerc/accounts.conf and edit the source under the [Personal] tag to point to maildir://~/mail, so aerc reads your e-mail from the local maildir.

\n

As well, set the default to gmail/INBOX to land in your inbox folder on start.

\n
[Personal]\nsource        = maildir://~/mail\noutgoing      = smtp+plain://me@gmail.com:$APP_PASSWORD_HERE@smtp.gmail.com:587\ndefault       = gmail/INBOX\nsmtp-starttls = yes\nfrom          = Me <me@gmail.com>\ncopy-to       = Sent
\n

Turn off your internet and run aerc. Now you can read your e-mail offline! We’ll want to keep the mailbox continuously in sync, though, so we’ll run mbsync frequently.

\n

First, we\u2019ll need a program called chronic, which is provided in moreutils. Download it with brew install moreutils.

\n

Run crontab -e to edit your local crontab, and put this in it.

\n

This will have cron execute mbsync gmail every minute, keeping your mailbox in sync with google\u2019s servers.

\n
MAILTO=""\nPATH=YOUR_PATH_HERE\n* * * * * chronic mbsync gmail
\n

Sending E-mail offline

\n

If you try to send e-mail in aerc while offline, the message will never be sent. What we’d like is a queue: if we’re online, the e-mail is sent immediately; otherwise, the message is saved in the queue, and all queued messages are sent out as soon as we regain connectivity.

\n

We\u2019ll use msmtp for that.

\n

Install it with brew install msmtp.

\n

msmtp\u2019s config file is called ~/.msmtprc. Fill that file with this:

\n
defaults\ntls on\n\naccount gmail\nauth on\nhost smtp.gmail.com\nport 587\nuser me \nfrom me@gmail.com\npassword APP_PASSWORD_HERE\n\naccount default: gmail
\n

Now we can send e-mail from the command line. This isn’t super useful yet, since aerc already has this functionality. Next, we need to implement the queueing capability we discussed. You’ll want to download two bash scripts that do this for us: msmtpq and msmtp-queue. These can be found here: https://github.com/tpn/msmtp/tree/master/scripts/msmtpq. Make them executable and place them somewhere on your path (I chose /usr/local/bin).

\n

Finally, we\u2019ll have to hook up aerc to use this capability in accounts.conf.

\n
[Personal]\nsource        = maildir://~/mail\noutgoing      = /usr/local/bin/msmtpq\ndefault       = gmail/INBOX\nsmtp-starttls = yes\nfrom          = Me <me@gmail.com>\ncopy-to       = Sent
\n

Finally, we\u2019ll want to be able to execute the queueing functionality of msmtpq every minute as well. Edit your crontab to look like this:

\n
MAILTO=""\nPATH=YOUR_PATH_HERE\n* * * * * chronic mbsync gmail\n* * * * * chronic msmtp-queue -r
\n

And with that, we\u2019re done! We can now read e-mail offline, which syncs every minute when online, and send e-mail offline, which will get queued, and sent as soon as we\u2019re back online again.

\n \n\n", "date_published": "2021-10-27T09:44:59-05:00" }, { "id": "gen/work-offline.html", "url": "gen/work-offline.html", "title": "Work Offline", "content_html": "\n\n \n \n \n \n \n \n\n Work Offline\n \n \n \n \n \n \n \n
\n

Work Offline

\n

Date: 2021-10-07T20:43:45-05:00

\n
\n

Table of Contents

\n \n

In my last post, I discussed the tradeoffs of various languages with regards to software longevity \u2013 I wanted to pick a language to use that would make long lasting software.

\n

In this post, I want to discuss working offline: why it intertwines with the choice of language, and how tooling can make it accessible.

\n

Joe Nelson, of PostgREST fame, wrote a series of posts that resonated with me, starting with “Going ‘Write Only’”, which begins by quoting Joey Hess, a person who “Lives in a cabin and programs Haskell on a drastically under-powered netbook”, harvests all of his electricity from the sun, and works using a distributed, git-like workflow.

\n

He then explains his motivation for going “write-only”:

\n
\n

These people\u2019s thoughts are not idle for me. They contain a reproach, a warning that one can be very busy and yet do unproductive things, hamartia. I want to focus on doing the right thing. Actually focus is the wrong word. Focusing my thoughts would imply the same thoughts but sharper, whereas I want to change the way I think.

\n
\n

He then went on to publish more blog posts focused on creating software that lasts:

\n\n

These are all great reads, and they inspired me to write this post about how I work offline and what I use to make that happen.

\n

Tools of the Trade

\n

Git

\n

It\u2019s a no-brainer to use git for this kind of workflow. You can go offline for weeks at a time, hacking away at your branch, and when you\u2019re back online, merge back to the main branch, fetch what you missed, and go back to hacking away offline. Since you have a copy of all the history of the repo on your hard disk, if you need to look at changes from the past, you can do just that. Git enables this workflow, while other centralized systems require constant connectivity.

\n

Man Pages

\n

Man pages (and info pages) predate the internet, so of course they work well offline. Unfortunately, on the Mac, man pages are pretty sparse (basically scavenged from old BSD manuals), and they aren’t always the most descriptive when it comes to using CLI tools, so I tend to reach for tldr in those cases.

\n

To pay it forward, I tend to bundle man pages with utilities that I make, like rvim or simplestats on cargo, even though Rust docs are more common in Rust land.

\n

Tldr

\n

Ever remember how to use tar? Me neither. tldr is a complement to man pages, offering examples for CLI applications just by typing tldr $keyword. I personally really like it, because it’s like having a concise Stack Overflow at your fingertips.

\n

ZIM files

\n

When I tell people about working offline, they ask, “but what about X website?” Well, if you want to look up a question on Wikipedia or Stack Overflow, you’d surely need online access, right?

\n

That\u2019s where ZIM files come in \u2013 offline archives of whole websites. The kiwix project (sponsored by wikipedia) offers downloads and torrents for ZIM files for sites like wikipedia and stackoverflow, which you can wholesale download over an internet connection, and then search through to your hearts content. So if you ever forget how to reverse a linked list in your favorite language, you can search through stackoverflow to find out :P.

\n\n

You can also download the project’s own ZIM file archiver and archive other sites that you like.

\n

DevDocs

\n

Not all languages/projects have offline documentation, but most of them have documentation on the web. DevDocs is a project that allows you to download and search through that documentation in a convenient way offline. Every time you get back online, you can sync the documentation of the projects you like to follow. Nifty.

\n

E-books

\n

E-books are really great too, in both PDF and EPUB format. You can keep copies on your hard disk and search through them without going to the internet.

\n

Papers

\n

arXiv is a repository of open-access articles in the sciences and mathematics. You can download papers there for free and read them at your leisure. There’s a treasure trove of papers to read!

\n

Rustup + Cargo

\n

Rust has a strong focus on offline work – cargo can turn doc comments into searchable HTML documentation, and rustup comes with its own offline documentation, by virtue of rustup docs.

\n

Cargo also allows one to force offline mode by adding the --offline command line flag \u2013 this forces cargo to use downloaded crates instead of going to crates.io.

\n

Hardware

\n

Currently, I work off of a 2017 MacBook Pro, which has only 128GB of disk space. When I need to supplement that with extra storage, I have an external SSD that I carry around with me (with ZIM files for offline use). That being said, the machine is getting a bit old in 2022, and I may replace it in the coming years with a Framework laptop, as that company is very devoted to the right to repair.

\n \n\n", "date_published": "2021-10-07T20:43:45-05:00" }, { "id": "gen/software-that-lasts-offline.html", "url": "gen/software-that-lasts-offline.html", "title": "Software that lasts Offline", "content_html": "\n\n \n \n \n \n \n \n\n Software that lasts Offline\n \n \n \n \n \n \n \n
\n

Software that lasts Offline

\n

Date: 2021-10-01T23:05:13-05:00

\n
\n

Table of Contents

\n \n

It seems like every day software gets outdated. It’s so hard to build software that lasts for 3 years, let alone 30. Yet the houses we live in have seen much longer lives, with just a bit of refurbishing here and there. Why can’t our software be the same way? _why the lucky stiff said something that resonates with me as a programmer as he left the internet.

\n
\n

To Program anymore was pointless.

\n

My programs would never live as long as the trial.

\n

A computer will never live as long as the trial.

\n

What if Amerika was only written for 32-bit power pc?

\n

Can an unfinished program be reconstructed??

\n

Can I write a program and go, \u201cAh, Well, You get the gist of it.\u201d

\n
\n

Our software doesn\u2019t last as long as the written word. It\u2019s a very defeating thing to think that most of our code doesn\u2019t last that long. I wondered if it had to be this way, or if it was something to do with how we approached software writing. C code has lasted for a few decades, and could last a few more \u2013 there\u2019s lots of COBOL and FORTRAN code in the wild that\u2019s 50 years old. Software that lasts is both high-level and low-level at the same time \u2013 assembly would never last because it\u2019s tied to its platform.

\n

A higher-level language need not be tied to any specific architecture. Yet the need for compatibility with architectures, past, present, and future makes it so the language must only build off of low-level primitives. A contradiction.

\n

I want a language that has offline documentation, that is robust, has wide compatibility in the past, present, and future, has standards, has multiple implementations, is fast, and is easy to develop for. Here\u2019s a short list of the languages I looked at, with some pros and cons for each in making software that lasts.

\n

Javascript

\n

Javascript is well specified (by committee), with multiple implementations and a strong commitment to backwards compatibility – but that’s about where the pros end. Even though I coded professionally in Javascript for a few years, things about the language still trip me up – I sometimes forget to check for nulls, or I get different types than I expect from the standard library. As well, I know those things will never be fixed, because the web is the most popular programming platform there is. Therefore, fundamentally, the language will never be able to smooth out its warts.

\n

Typescript

\n

Typescript smooths out most of the usability issues of Javascript, and gives it static typing, generics, and new constructs (enums, interfaces, types). It compiles to Javascript quickly, and considering how accessible Javascript is, it shares most of its accessibility pros. Typescript has laxer backwards-compatibility requirements, but given its compatibility with Javascript, it won’t be able to fix the language’s warts either.

\n

WASM

\n

Web Assembly is a new contender for web language of the future TM. It\u2019s a minimalistic language with S-expressions (like lisps) and is meant to be an easy compiler target. Go, C, C++, Rust, and others can compile down to it, targeting the WASM capabilities of the browser. As well, WASI seems like a portable way to run sandboxed applications in the future.

\n

It\u2019s too low-level for productive use, but is an interesting foray into fixing the kludge of the web.

\n

Ruby

\n

Ruby is the most OOP language I can think of – message passing, everything is an object, and GC pauses ad nauseam. It’s a language with a lot of expressiveness, and a lot of elegance. It has strong C bindings, so it has good interop with system libraries – and pretty good backwards compatibility.

\n

That being said, it\u2019s slow and clunky to write. The philosophy of expressiveness means that everybody writes ruby code differently, and there\u2019s a huge divide between ruby programmers, who are more restrictive with what functionality they use, and rails programmers, who are more keen on monkey-patching everything they can find for usability reasons. Not to say one side is right, but the language\u2019s stewardship has been on appeasing many camps, and that leads to fragmentation.

\n

Python

\n

Python is also very OOP, but unlike Ruby, it even comes with its own Zen, which you can read by entering import this at the REPL.

\n

Beautiful is better than ugly. Explicit is better than implicit. Simple is better than complex. Complex is better than complicated. Flat is better than nested. Sparse is better than dense. Readability counts. There should be one\u2013 and preferably only one \u2013obvious way to do it.

\n

Python prefers fewer ways to do one thing, but that zen has been wearing thin: the standard library is huge, which makes it hard to commit to backwards compatibility. Python also has some system dependencies that are less than stellar for backwards compatibility, and the large surface area can make it hard for the language to stay stable.

\n

Oh yeah, and remember Python3?

\n

OCaml

\n

OCaml is an interesting language; it has a bytecode interpreter, cross-compilation, and compilation to native code, just like Haskell. It’s relatively fast to compile, but has issues with backwards compatibility and footguns. As well, the standard library has been reimplemented many times, including twice by Jane Street (Base and Core).

\n

It\u2019s a clear language with some baggage (Few people use the \u201cObject Oriented\u201d or \u201cO\u201d features of \u201cOCaml\u201d). For loops and classes are often struck down in code review as anti-patterns. Best to be functional, all the time.

\n

It is relatively fast, with a good ecosystem (Dune makes building OCaml apps pretty nice in 2021) but it\u2019s still a relatively small ecosystem, fighting against Haskell to become Typed Functional Programming\u2019s main language.

\n

Offline documentation is great, and the standard library asks little of its environment, but OCaml lacks multicore support – in an increasingly parallel world, that’s a deal-breaker. It’s looking like a reimplementation of the standard library with async might require a major version bump, to 5.x.

\n

JVM languages (Java, Scala, Clojure)

\n

JVM languages are pretty strong, with the JVM allowing users to target many platforms with their code (since the JVM runs on many things). That being said, the JVM exacts a penalty: its runtimes can be quite hefty and difficult to port. It’s not as easy as sending someone a binary they can run, and containerizing JVM apps is harder than it is for languages that produce native binaries.

\n

CLR languages (C#, F#)

\n

Same as the JVM languages, although I have to say that F# and C# are pretty fun to program in.

\n

Go

\n

Go was the first serious language in the list I considered learning \u2013 simple like C, with a strong standard library for the modern era with a focus on async + web programming. Sounds like a dream. Oh, and fast compile times and cross-compilation. Woah. Relatively small native binaries that don\u2019t rely on libc? Doable in Go.

\n

Lots of big projects have been done in go, like most hashicorp stuff, docker, kubernetes, and a wealth of devops/cloud tools. It\u2019s a productive language, and one that nudges you to sane defaults.

\n

But it\u2019s not all sunshine and roses \u2013 the package managing story has been a nightmare, there are no generics (Hello casting to Interface{}) and I don\u2019t understand why a GC\u2019ed language should have pointers and references explicitly? You get a lot for free, just by using Go, but you pay for it with complexity \u2013 I rarely miss generics as an application developer, until I\u2019m slapped with complexity because the library developer decided to hand over the complexity to me. Go also has one standard implementation and stewardship led by Google, which makes it a bit odd \u2013 Google has never been one for backwards compatibility, and it seems like Go might be due for a Go 2.0, which could be an ecosystem and binary break for Go. Only time will tell if Go can survive the blow, or if it\u2019ll go with the route of holding onto the choices of the past.

\n

C++

\n

C++ is the first language on this list with no garbage collection. It has an ISO-standardized specification, with many committees and many implementations. It has functionality for OOP, functional programming, generics, and async, with the promise of being as fast as C while staying easier to use.

\n

It mostly delivers on that promise. Even though modern C++ has added many features, it has kept a strong commitment to backwards compatibility (it maintains ABI compatibility for a long time, breaking ABI only recently, in C++11), removing only clearly broken functionality (auto_ptr, anyone?). But it can be hard to use – the iterator API is one frustrating example – and it can be hard to see the runtime cost of the abstractions you use. Even though C++ follows the “Zero-Cost Abstractions” principle, where you don’t pay for what you don’t use, and what you do use you couldn’t hand-code any better, that promise breaks down in places: std::map is extremely slow on many workloads, because the standard’s interface guarantees effectively force a node-based implementation that is slow on modern hardware, and iterators are a good example of an API footgun (remember to always check against .end()!).

\n

The complexity never really pays off – you have to litter your code with extra keywords like const to the left and right of your functions, along with noexcept, final, and override. You have to remember what is and isn’t virtual and use keywords accordingly, and you have to generate move constructors and copy constructors and remember which is which. Why are there so many ways to initialize an object, and why is there so much to remember when you write your own class?

\n

Oh, and what\u2019s the difference between struct and class? Who knows?

\n

C++ is a language with lots of promises but it has run into the limits of its promises \u2013 backwards compatibility, ease of use, performance, and expressiveness are all in tension, and C++ is the language you can see that in the most.

\n

C

\n

Meanwhile, C is much more minimalistic than C++. You get nothing \u2013 no expanding arrays, no hashmaps, no trees, no graphs, no async, no unicode, nothing.

\n

It\u2019s a very bare language. That helps it with portability (C is the most portable language on this list by far) but it pays that price by doing almost nothing for you.

\n

It\u2019s a high-level language that makes few choices for you and leaves you in a sandbox of your own creation. That can make code-sharing hard, since it\u2019s bound to its environment \u2013 if you want to make a cross platform library, you have to be careful about which libraries you use, since different OSes have different system libraries.

\n

It has a static type system and is speedy, for sure \u2013 but it can be clunky and unsafe as well.

\n

Rust Every Day (For 3 years)

\n

With all that being said, I\u2019ve decided to pick Rust as my language of choice for the next 3 years for as many programming related tasks as I can. Rust is pleasant to develop for, has pledged backwards compatibility since 2015, targets a wide variety of architectures (thanks mainly to LLVM) and I\u2019m convinced is a language for the rest of my career; it has a great team working on it, with a unique governance structure that makes it resilient to being steered by one interest group.

\n

It\u2019s taken some great ideas from functional programming (Tagged Unions, Sum Types, iterators) while keeping the runtime promises of more imperative languages. It\u2019s a great language to learn for the future, and one that I\u2019m sure will keep on growing, and for that, I\u2019m throwing my weight behind it.

\n

Rust every day. For 3 years. Then I\u2019ll revisit this and see what\u2019s changed, but I\u2019d like to use Rust for the next 10 years, at least.

\n \n\n", "date_published": "2021-10-01T23:05:13-05:00" }, { "id": "gen/implementing-iterators.html", "url": "gen/implementing-iterators.html", "title": "Implementing Iterators", "content_html": "\n\n \n \n \n \n \n \n\n Implementing Iterators\n \n \n \n \n \n \n \n \n
\n

Implementing Iterators

\n

Date: 2021-09-18T21:41:43-05:00

\n
\n

Table of Contents

\n \n

Let\u2019s talk about implementing iterators: a way to visit every item in a collection. We\u2019ll use C as an implementation language because it\u2019s simpler than other languages, and we\u2019ll implement C++\u2019s iterator API. This is the same in most mainstream programming languages, like Rust, C++, Python, Ruby, JavaScript, Java, C#, and PHP, with a few small implementation differences.

\n

The API

\n

The API we\u2019ll create is simple. A int* next(int* it) function that takes an iterator and returns its next element, or a NULL pointer if nothing comes next, and a bool has_next(int* it) that returns true if it has a next item, or false if it does not.

\n

The C++ iterator API needs a few functions that give you an iterator into a collection. These are called begin() and end(), which return a pointer to the first item in the collection and a pointer one past the end of the collection. This is a dangerous API – dereferencing end() is undefined behavior – but it makes our APIs a bit cleaner. Tradeoffs, I guess.

\n

We\u2019ll elide the details of begin in our example and implement it ourselves.

\n

Let\u2019s start by defining our collection: an array of ints from 1 - 5.

\n
int items[] = {1, 2, 3, 4, 5};
\n

Let\u2019s say we want to print them: no need for iterators, of course.

\n
#include <stdio.h>\n\nint items[] = {1, 2, 3, 4, 5};\n\nint main(void) {\n    for (int i = 0; i < 5; i++) \n        printf("%d ", items[i]);  \n}
\n

But we have to initialize and increment a variable and use it as an index into our collection… We want a clearer way of expressing a loop through all the items in a collection, and that’s where iterators come in.

\n

Let\u2019s start by defining the begin and end iterators.

\n
int* begin = &items[0];\nint* end = &items[5];
\n

Remember, the begin points to the first item of the collection, and end points to one past the end. We can\u2019t dereference end, so keep that in mind.

\n

Now, to create the next() function, we want to take an iterator and move to the next item if we aren\u2019t already at the end iterator.

\n

Let\u2019s do that:

\n
int* next(int* it) {\n    if (it != end) \n        return (int*)((char*)it + sizeof(int));\n    return NULL;\n}
\n

Since we know that our iterator is an int pointer, we want to advance it by the size of an int – four bytes on my machine (the result of sizeof(int)). Casting to char* to do the byte arithmetic works, but there’s a shorthand that C gives us, called pointer arithmetic: the compiler knows this is an int pointer, so addition and subtraction are already scaled to move forwards and backwards by the sizeof an int. (Note that this scaling means it + sizeof(int) on the int pointer itself would skip four ints, not four bytes.)

\n

We can rewrite the above as:

\n
int* next(int* it) {\n    if (it != end) \n        return ++it;\n    return NULL;\n}
\n

Next, we want to write has_next, which should return 1 (true) if the iterator can be advanced, or 0 (false) if not. An iterator has a next item if it doesn’t point at the last item in the collection, the one just before the end pointer. Thus, we can define has_next like so:

\n
int has_next(int* it) {\n    return it != end - 1;\n}
\n

Let\u2019s use our iterators thus far to traverse our collection:

\n
#include <stdio.h>\n\nint items[] = {1, 2, 3, 4, 5};\n\nint* begin = &items[0];\nint* end = &items[5];\n\nint* next(int* it) {\n    if (it != end) \n        return ++it;\n    return NULL;\n}\n\nint has_next(int* it) {\n    return it != end - 1;\n}\n\nint main(void) {\n    puts("Printing forwards");\n    int* it = begin;\n    while (it != end) {\n        printf("%d has next? ", *it);\n        puts(has_next(it) ? "true" : "false");\n        it = next(it);\n    }\n}
\n

This should print out:

\n
Printing forwards\n1 has next? true\n2 has next? true\n3 has next? true\n4 has next? true\n5 has next? false
\n

Why use Iterators?

\n

If this seems like a lot of ceremony for iterating through an array, it is. It\u2019s totally unnecessary. It gives us nothing more powerful than what a raw for loop would give us. But what happens if our collection isn\u2019t linear? What happens if we traverse a sorted map, or a graph?

\n

With a for loop, we ask the caller to understand how the data structure is implemented. With an iterator, we provide definitions of next and has_next, and the user can call them without knowing anything about the underlying collection beyond the fact that it is iterable.

\n

This allows us to wrap graphs, trees, hash tables, ranges (finite and infinite), and circular data structures in a friendly API for our users.

\n

As well, language features reward the use of iterators with terser syntax: in C++, Rust, Java, C#, Ruby, Python, and JavaScript, if you implement each language’s iterable API, you can do something along these lines:

\n
for (item in collection)\n  do something to item
\n

And the language takes care of the rest. We can’t do that in C, but in other languages we get a reward for implementing the protocol for our own types: they get to behave like library-defined types.

\n

Next Steps

\n

Now that we’ve implemented iterators in C, try giving it a shot in your favorite language and see what its iterator protocol looks like. It’s loads of fun, I swear.

\n

I tried it myself in C when writing a resizable array type too:

\n
#include <assert.h>\n#include <math.h>\n#include <stdarg.h>\n#include <stdlib.h>\n\n#define max(a, b) ((a) > (b) ? (a) : (b))\n\ntypedef struct Vector {\n  size_t len;\n  size_t capacity;\n  int *items;\n} Vector;\n\n// Allow the user to set their own alloc/free\nstatic void *(*__vector_malloc)(size_t) = malloc;\nstatic void *(*__vector_realloc)(void *, size_t) = realloc;\nstatic void (*__vector_free)(void *) = free;\n\nvoid vector_set_alloc(void *(*malloc)(size_t), void *(*realloc)(void *, size_t),\n                      void (*free)(void *)) {\n  __vector_malloc = malloc;\n  __vector_realloc = realloc;\n  __vector_free = free;\n}\n\nVector *vector_new(const size_t len, ...) {\n  Vector *v = __vector_malloc(sizeof(Vector));\n  int capacity = 8;\n  capacity = max(pow(2, ceil(log(len) / log(2))), capacity);\n\n  v->items = __vector_malloc(sizeof(int) * capacity);\n  v->len = len;\n  v->capacity = capacity;\n\n  if (len > 0) {\n    va_list argp;\n    va_start(argp, len);\n\n    for (size_t i = 0; i < len; i++) {\n      v->items[i] = va_arg(argp, int);\n    }\n\n    va_end(argp);\n  }\n\n  return v;\n}\n\nvoid vector_free(Vector *v) {\n  __vector_free(v->items);\n  __vector_free(v);\n}\n\nint vector_get(Vector *v, size_t index) {\n  assert(index < v->len);\n  return v->items[index];\n}\n\nvoid vector_set(Vector *v, size_t index, int val) {\n  assert(index < v->len);\n  v->items[index] = val;\n}\n\nint vector_empty(Vector *v) { return v->len == 0; }\n\nvoid vector_push(Vector *v, int val) {\n  if (v->len == v->capacity) {\n    v->capacity *= 2;\n    v->items = __vector_realloc(v->items, sizeof(int) * v->capacity);\n  }\n  v->items[v->len] = val;\n  v->len++;\n}\n\nint *vector_begin(Vector *v) { return &v->items[0]; }\n\nint *vector_end(Vector *v) { return &v->items[v->len]; }\n\nint *vector_next(Vector *v, int *it) {\n  if (it != vector_end(v))\n    return ++it;\n  return NULL;\n}\n\nvoid vector_for_each(Vector *v, int (*fn)(int)) {\n  for (size_t i = 0; i < v->len; i++) {\n    v->items[i] = (*fn)(v->items[i]);\n  }\n}\n\nint vector_pop(Vector *v) {\n  assert(v->len > 0);\n  int top = v->items[v->len - 1];\n  v->len--;\n  return top;\n}
\n \n\n", "date_published": "2021-09-18T21:41:43-05:00" }, { "id": "gen/desert-island-discs.html", "url": "gen/desert-island-discs.html", "title": "Desert Island Discs", "content_html": "\n\n \n \n \n \n \n \n\n Desert Island Discs\n \n \n \n \n \n \n \n
\n

Desert Island Discs

\n

Date: 2021-09-06T22:47:22-05:00

\n
\n

Table of Contents

\n \n

What software would you take to a desert island? For me, I’d need the following: an OS, a libc, a C compiler, a shell, a database, a networking stack, unix utils, a lisp interpreter, and an editor. I’ve decided to try my hand at writing these to get better at low-level programming, and learn the stack a bit better. It’s a good challenge, but in the interest of time and sanity I will be putting lots of asterisks next to each disc.

\n

An Operating System

\n

I\u2019m not going to write an OS. Maybe I\u2019ll write some signals and system calls, but that\u2019s about it.

\n

A C Compiler

\n

I\u2019ll write a C Compiler that implements most of C89, that emits assembly of the computer I\u2019m working on.

\n

Libc

\n

A limited subset of libc would be nice. Plauger has a good book on implementing a C89 compliant libc, which I will most likely follow, along with looking at some code from musl.

\n

As well, I\u2019ll be writing some crypto and data structures to help with other tasks down the road.

\n

A shell

\n

Nothing like bash. I\u2019ve decided to write a small shell and relegate the rest to the shell I\u2019m currently working on.

\n

A Database

\n

A simple serializable Key-Value database should suffice. Will need to learn B-Trees, though.

\n

A Networking Stack

\n

I\u2019ll learn how to write an HTTP Client and Server using the sockets library.

\n

Unix Utils

\n

I\u2019ll write some basic unix utils, striving for some partial POSIX compliance.

\n

A Lisp Interpreter

\n

Lisp, being one of the simplest languages to implement, lets me build a higher-level language from C. That makes it a great target for learning to write an interpreter.

\n

A Text Editor

\n

Vim is my text editor of choice; it\u2019d be nice to be able to write an editor with some similar functionality.

\n \n\n", "date_published": "2021-09-06T22:47:22-05:00" }, { "id": "gen/the-expression-problem-and-operations-on-matricies.html", "url": "gen/the-expression-problem-and-operations-on-matricies.html", "title": "The Expression Problem And Operations On Matricies", "content_html": "\n\n \n \n \n \n \n \n\n The Expression Problem And Operations On Matricies\n \n \n \n \n \n \n \n \n
\n

The Expression Problem And Operations On Matricies

\n

Date: 2021-06-22T08:52:01-04:00

\n
\n

Table of Contents

\n \n

I\u2019ve heard the sentiment that technical interviews focus on the wrong things; technical aptitude in data structures and algorithms isn\u2019t a great measure of people\u2019s on-the-job performance. I agree. That being said, there are some small problems where a bit of that knowledge can help you out.

\n

I\u2019m going to use an example I encountered recently, where I wanted to aggregate my personal finances into monthly, quarterly, and yearly reports, with columns for earnings, spend, and cashflow (earnings - spend).

\n

I downloaded the relevant CSVs and went to work parsing them.

\n

Parsing

\n

The first problem came with cleaning up some transactions that were unnecessary \u2013 banks tend to charge a maintenance fee, but they end up crediting you if you meet certain criteria. Even though this balances out, I didn\u2019t want this to count in my earnings and spend, so I wrote a sed regex to delete these lines. Likewise, I wanted to remove some of my investments that I had made (I don\u2019t consider these to be spending, and I wanted to track these another way). Another sed regex it is. Eventually this became a pain, so I made a bash function to combine the regexes in an array to parse the CSV. You just add regexes and it\u2019ll remove the related transactions. Easy enough.

\n

The problem

\n

If you visualize the problem at hand, you\u2019ll get this matrix.

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
MonthlyQuarterlyYearly
EarningsMonthly EarningsQuarterly EarningsYearly Earnings
SpendMonthly SpendQuarterly SpendYearly Spend
CashflowMonthly CashflowQuarterly CashflowYearly Cashflow
\n

We have an (m * n) problem: adding a new column means (m) more cells to calculate, and adding a new row means (n) more.

\n

Let\u2019s get solving.

\n

Naive Approach

\n

The Naive approach is the O(m * n) solution, where you create a function that deals with a particular cell of this matrix. Take Monthly Earnings. You would write a function that does the logic for dividing the CSV into months, and then applying the logic for Earnings to it.

\n

You would repeat this eight more times, ending up with 9 different functions.

\n

When calling this logic, you would have a switch-case that would select the logic required.

\n
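To make the tedium concrete, here is a minimal sketch of the naive dispatch in C. All names are illustrative, and the string results stand in for the real slicing and aggregation logic:

```c
#include <string.h>

/* One enum value and one function per cell of the matrix. */
typedef enum Cell {
    MONTHLY_EARNINGS, MONTHLY_SPEND, MONTHLY_CASHFLOW,
    QUARTERLY_EARNINGS, QUARTERLY_SPEND, QUARTERLY_CASHFLOW,
    YEARLY_EARNINGS, YEARLY_SPEND, YEARLY_CASHFLOW,
} Cell;

/* Each function would slice the CSV and aggregate it;
   here they just return a label to keep the sketch short. */
static const char *monthlyEarnings(void)   { return "monthly earnings"; }
static const char *monthlySpend(void)      { return "monthly spend"; }
static const char *monthlyCashflow(void)   { return "monthly cashflow"; }
static const char *quarterlyEarnings(void) { return "quarterly earnings"; }
static const char *quarterlySpend(void)    { return "quarterly spend"; }
static const char *quarterlyCashflow(void) { return "quarterly cashflow"; }
static const char *yearlyEarnings(void)    { return "yearly earnings"; }
static const char *yearlySpend(void)       { return "yearly spend"; }
static const char *yearlyCashflow(void)    { return "yearly cashflow"; }

/* The switch-case that selects the logic for one cell. */
const char *generateCell(Cell cell) {
    switch (cell) {
    case MONTHLY_EARNINGS:   return monthlyEarnings();
    case MONTHLY_SPEND:      return monthlySpend();
    case MONTHLY_CASHFLOW:   return monthlyCashflow();
    case QUARTERLY_EARNINGS: return quarterlyEarnings();
    case QUARTERLY_SPEND:    return quarterlySpend();
    case QUARTERLY_CASHFLOW: return quarterlyCashflow();
    case YEARLY_EARNINGS:    return yearlyEarnings();
    case YEARLY_SPEND:       return yearlySpend();
    case YEARLY_CASHFLOW:    return yearlyCashflow();
    }
    return "";
}
```

Nine near-identical functions and nine cases for a 3x3 matrix, and every new row or column adds another batch.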

This isn\u2019t very DRY, and it\u2019s tedious even with only 9 cells. I wanted something better. The culprit is how the number of cells in a matrix grows.

\n

The Expression problem

\n

The expression problem is a fundamental tension in programming: it\u2019s hard to add both new types and new operations over those types without modifying existing code.

\n

OOP Languages like Java make it easy to create new types that have operations on data. But let\u2019s say you want to add a new method to all your classes. You now need to add a new method to all of your existing and future classes.

\n

ML-style languages use pattern matching, which allows you to easily add a new operation as a single function. But to add a new type, you have to add a new case to all of the existing pattern matches.

\n

Let\u2019s say I add a new time period, \u201cbi-yearly\u201d to denote half a year chunks. Well, if we go by the naive case, we\u2019d have to add new cases for bi-yearly + cashflow, bi-yearly + earnings, bi-yearly + spend. That\u2019s 3 new functions for one new time period. Ouch.

\n

Let\u2019s say I want a new category that only counts purchases that are larger than $100, and call these \u201clarge purchases\u201d. If so, I would have to add logic to count for monthly, quarterly, bi-yearly, and yearly time periods. That\u2019s 4 new functions for a new category.

\n

Let\u2019s say I want to add a new dimension. I want to split my purchases for all of the above categories and time periods between my credit card and debit card. That\u2019s 4 * 4, or 16 new functions I\u2019d need to implement.

\n

In Big O terms, the number of functions grows linearly in each dimension, so it grows with the product of the dimension sizes overall.

\n

If we have 4 monetary categories and 4 time periods, we have 4 * 4 or 16 functions to implement. If we have 4 monetary categories, 4 time periods, and 2 types of credit cards, we have 4 * 4 * 2, or 32 functions to implement.

\n

Our first proposed matrix is a 2D square, and we\u2019re calculating its area. A square with a length of 4 and a height of 4 has an area of 16.

\n

Our second proposed matrix is 3D (a rectangular box). To calculate its volume, you multiply its length, width, and height. We have a length of 4, a width of 4, and a height of 2, totalling 32.

\n

As we add new fields to our rows and columns, and new dimensions to our matrix, we\u2019ll soon see that this becomes untenable. (What happens if I also want to count my investment accounts, of which I have many? I\u2019d have to add more and more dimensions, and the number of functions to implement increases quite a lot.)

\n

Improving the Naive Approach

\n

In the interest of clean code, I wanted to create small composable functions that would calculate each category.

\n

The Earnings function would only calculate transactions with a positive amount. The Spend function would only calculate transactions with a negative amount. The Cashflow function would calculate all transactions.

\n

The Monthly function would divide the CSV into months, and apply a function to each range. The Quarterly function would divide the CSV into quarters, and apply a function to each range. The Yearly function would divide the CSV into years, and apply a function to each range.

\n

But the problem is how to set this up properly.

\n

We need some way to signal to the main function that we want to calculate a row * column pairing (a cell). If we add a new dimension, we don\u2019t want to break previous code.

\n

Using a pair

\n

One way to signal this is to use a pair of enums that are captured in a pair as (row, col). This works well enough if we stick to two dimensions. If we add a 3rd dimension, though, this will create incorrect code. If we\u2019re strict on requiring a pair, then we can\u2019t add the 3rd dimension at all without breaking all of our existing code. If we\u2019re looser (allow any tuple, and unpack the first, second and third values) this will work, but our code will be a bit confusing in order to maintain backwards compatibility (some code will check the first and second fields of a 3-tuple, even though it should be checking all three).

\n
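A minimal sketch of the pair approach in C (the names are illustrative): the row and column travel together as one value, and each (row, column) pair identifies a cell.

```c
typedef enum Period { MONTHLY, QUARTERLY, YEARLY } Period;
typedef enum Money { EARNINGS, SPEND, CASHFLOW } Money;

/* A cell is identified by a (row, column) pair. */
typedef struct Pair {
    Period period;
    Money money;
} Pair;

/* Map a pair to one of the 9 cells: 3 columns per row. */
int cellId(Pair p) {
    return (int)p.period * 3 + (int)p.money;
}
```

This works, but it hard-codes the dimensionality at two: adding a credit/debit axis means changing Pair and every call site that constructs one.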

Using a flag

\n

Another way to signal this is to use a Flag enum. A flag enum is an enum that has values corresponding to powers of 2.

\n

For example:

\n
typedef enum Color {\n  RED = 1,\n  GREEN = 2,\n  BLUE = 4\n} Color;
\n

(Some people prefer to write it like this, to be explicit that this is a flag):

\n
typedef enum Color {\n  RED = 1 << 0, // 1 \n  GREEN = 1 << 1, // 2\n  BLUE = 1 << 2, // 4\n} Color;
\n

This has the nice property that we can use bitwise or to denote more than one state, and bitwise and to check whether the enum contains a particular state.

\n
Color color = RED | GREEN; // this color is Red and Green\nColor Black = RED | GREEN | BLUE; // this color, black, is all the colors\n\nif (color & RED) {\n  // this color has red, do some logic in the red case\n}\nif (color & GREEN) {\n  // this color has green, do some logic in the green case\n}\nif (color & BLUE) {\n  // this color has blue, do some logic in the blue case\n}
\n

In C, an enum is typically the size of an int (32 bits), so you can store up to 32 fields (rows + columns + dimensions, in our case). Sometimes this isn\u2019t enough, and you\u2019ll have to find another way, but in our case, it works fine.

\n

The final implementation

\n

Finally to solve the problem, we want to provide a flag enum, and get the CSV that we\u2019re applying it to. First, we want to slice the CSV into the range provided, and then apply some category (Earning, Spend, Cashflow) to it, and then save the CSV.

\n

That can be done something like this:

\n
typedef enum Categories {\n  MONTHLY = 1 << 0,\n  QUARTERLY = 1 << 1,\n  YEARLY = 1 << 2,\n  EARNINGS = 1 << 3,\n  SPEND = 1 << 4,\n  CASHFLOW = 1 << 5,\n} Categories;\n\nvoid generateCsvs(Categories category) {\n  if (category & MONTHLY) {\n    doMonthlyLogic();\n  }\n  if (category & QUARTERLY) {\n    doQuarterlyLogic();\n  }\n  if (category & YEARLY) {\n    doYearlyLogic();\n  }\n  if (category & EARNINGS) {\n    doEarningsLogic();\n  }\n  if (category & SPEND) {\n    doSpendLogic();\n  }\n  if (category & CASHFLOW) {\n    doCashflowLogic();\n  }\n}
\n

We can generate our CSVs just like that. Nice. If we add a new dimension, like credit card vs debit card, all we have to do is add it to our enum and our main function. This only adds two new enum values and two new cases. We\u2019ve gone from adding (m * n) functions for our logic to just (m + n). Big O strikes again.

\n
typedef enum Categories {\n  MONTHLY = 1 << 0,\n  QUARTERLY = 1 << 1,\n  YEARLY = 1 << 2,\n  EARNINGS = 1 << 3,\n  SPEND = 1 << 4,\n  CASHFLOW = 1 << 5,\n  CREDIT = 1 << 6,\n  DEBIT = 1 << 7,\n} Categories;\n\nvoid generateCsvs(Categories category) {\n  if (category & CREDIT) {\n    doCreditLogic();\n  }\n  if (category & DEBIT) {\n    doDebitLogic();\n  }\n  if (category & MONTHLY) {\n    doMonthlyLogic();\n  }\n  if (category & QUARTERLY) {\n    doQuarterlyLogic();\n  }\n  if (category & YEARLY) {\n    doYearlyLogic();\n  }\n  if (category & EARNINGS) {\n    doEarningsLogic();\n  }\n  if (category & SPEND) {\n    doSpendLogic();\n  }\n  if (category & CASHFLOW) {\n    doCashflowLogic();\n  }\n}
\n

If we wanted more categories than fit in an enum, we could still do that using a struct of bitfields instead. This creates a struct where every member is a one-bit boolean flag, and we check whether it\u2019s set in our generateCsvs code.

\n
typedef struct Categories {\n  unsigned int MONTHLY : 1;\n  unsigned int QUARTERLY : 1; \n  unsigned int YEARLY : 1; \n  unsigned int EARNINGS : 1;\n  unsigned int SPEND : 1; \n  unsigned int CASHFLOW : 1; \n  unsigned int CREDIT : 1; \n  unsigned int DEBIT : 1; \n} Categories;\n\nCategories category = { 1, 0, 0, 1 }; // MONTHLY and EARNINGS are set, everything else is zero-initialized.\n\n// or this:\nCategories category = {0};\ncategory.MONTHLY = 1; // set MONTHLY;\ncategory.EARNINGS = 1; // set EARNINGS; \n\nvoid generateCsvs(Categories category) {\n  if (category.MONTHLY) {\n    doMonthlyLogic();\n  }\n  // etc.\n}
\n

A note on Associativity

\n

But wait, there\u2019s something we can improve upon in our solution:

\n

You might\u2019ve noticed that we\u2019re coupling our code through time: since we\u2019ve decided to cut up the CSV by time period first and then calculate the monetary category, flipping the order of those steps might give a different result. This is bad, because refactoring tends to reorder things, and code that is coupled through time tends to get messier.

\n

To improve this, we need to add a few restrictions.

\n

But first, a review on associativity and composition.

\n

Associativity means that it doesn\u2019t matter how the applications of a function are grouped.

\n

Let\u2019s take the multiplication function. You\u2019ll notice that we can group the multiplications any way we like and the result is still correct.

\n
\n

4 * 3 * 2 == (4 * 3) * 2 == 4 * (3 * 2)

\n
\n

Whereas division is not associative, because:

\n
\n

(12 / 2) / 3 != 12 / (2 / 3).

\n
\n

What we did above was like division, where the functions must be applied in a fixed order, so they are coupled in time (the parentheses denote this). What we really want is a multiplicative (associative) function, because then no matter how many changes we make to the code, it will only grow in complexity linearly, not polynomially.

\n

Thus, if we guarantee that our operations are associative, then we don\u2019t have to worry about how we lay out our main function at all.

\n

To do this, we\u2019ll write our functions so that each one takes a CSV and returns a CSV after doing some work on it. Each function must accept any CSV that any other step in the main function can produce. So, we\u2019ll change our main function to look like this, where every function takes a CSV and returns a CSV.

\n

We\u2019ll use the flag enum to make sure that we\u2019re applying just the functions that we want.

\n
void generateCsvs(Categories category) {\n  Csv csv = {0}; // assume Csv is some array-backed type\n  if (category & CREDIT) {\n    csv = doCreditLogic(csv);\n  }\n  if (category & DEBIT) {\n    csv = doDebitLogic(csv);\n  }\n  // etc\n  writeToCsv(csv);\n}\n// this is the same function \nvoid generateCsvs(Categories category) {\n  Csv csv = {0};\n  // We've flipped the order, but it still works \n  if (category & DEBIT) {\n    csv = doDebitLogic(csv);\n  }\n  if (category & CREDIT) {\n    csv = doCreditLogic(csv);\n  }\n  // etc\n  writeToCsv(csv);\n}
\n

Conclusion

\n

We\u2019ve seen how we can use flag enums, combined with some logic, to cut down the number of functions we have to write in order to calculate the cells of a matrix. While I don\u2019t agree that big tech interviews are the best way to assess candidates, sometimes these problems crop up, and people have been grappling with them for a long time (like the expression problem).

\n \n\n", "date_published": "2021-06-22T08:52:01-04:00" }, { "id": "gen/write-rfcs.html", "url": "gen/write-rfcs.html", "title": "Write RFCs", "content_html": "\n\n \n \n \n \n \n \n\n Write RFCs\n \n \n \n \n \n \n \n
\n

Write RFCs

\n

Date: 2021-06-03T21:00:10-04:00

\n
\n

Table of Contents

\n \n

RFCs are Requests for Comments, popularized by the Internet Engineering Task Force (IETF) which develops and promotes standards for the internet.

\n

Defining an RFC

\n

Much has been said about RFCs (Including RFC 3 which outlines how to write an RFC for the IETF), but let\u2019s read through the main points of RFC 3.

\n
    \n
  • RFCs can be on thoughts, suggestions, etc. relating to the subject (The internet in this case)
  • \n
  • RFCs should be timely, rather than polished
  • \n
  • RFCs do not require examples
  • \n
  • RFCs can be as short or as long as needed
  • \n
\n

According to RFC 3, RFCs should have the following information:

\n
    \n
  1. \u201cNetwork Working Group Request for Comments:\u201d X (where X is the number of the RFC)
  \n
  2. Author and Affiliation
  \n
  3. Date
  \n
  4. Title (Does not need to be unique)
  \n
\n

Benefits of RFCs

\n

According to 6 Lessons I learned while implementing technical RFCs as a decision making tool, after implementing RFCs in his organization, Juan Pablo Buritic\u00e1 picked them for the following reasons:

\n
    \n
  • enable individual contributors to make decisions for systems they\u2019re responsible for
  • \n
  • allow domain experts to have input in decisions when they\u2019re not directly involved in building a particular system
  • \n
  • manage the risk of decisions made
  • \n
  • include team members without it becoming design by committee
  • \n
  • have a snapshot of context for the future
  • \n
  • be asynchronous
  • \n
  • work on multiple projects in parallel
  • \n
\n

In my company, RFCs allow us to propose ideas to improve processes, the development experience, and the product, with low risk of retribution and without the anxiety of giving a presentation. Q & A is relaxed for both the author of the RFC and those writing comments, as they are allowed to proceed asynchronously, without either party feeling pressured to have all the answers right away. They\u2019re also a written record of the thoughts of the authors and reviewers throughout the lifecycle of the proposal, and serve as a historical artifact for reflection (if a proposal that sounded good didn\u2019t turn out so well, why didn\u2019t it work out?).

\n

Implementing RFCs

\n

Oxide Computer explained how they do RFCs (which they call RFDs, Requests for Discussions) here: RFDs at Oxide Computer

\n

At Oxide, RFDs are appropriate for the following cases:

\n
    \n
  • Add or change a company process
  • \n
  • An architectural or design decision for hardware or software
  • \n
  • Change to an API or command-line tool used by customers
  • \n
  • Change to an internal API or tool
  • \n
  • Change to an internal process
  • \n
  • A design for testing
  • \n
\n

Oxide has a few twists, like adding a state as metadata (an RFD can be in the Prediscussion, Ideation, Discussion, Published, Committed, or Abandoned state), and go into detail about integrating their RFD system into git.

\n

A Template for RFCs

\n

Following what Oxide did, I made a template repository for RFCs.

\n

You can find it here: Template RFC Repository

\n \n\n", "date_published": "2021-06-03T21:00:10-04:00" }, { "id": "gen/learning-recursion.html", "url": "gen/learning-recursion.html", "title": "Learning Recursion", "content_html": "\n\n \n \n \n \n \n \n\n Learning Recursion\n \n \n \n \n \n \n \n \n
\n

Learning Recursion

\n

Date: 2021-06-03T11:30:55-04:00

\n
\n

Table of Contents

\n \n

It\u2019s been said that the only way to learn recursion is to learn recursion. So let\u2019s get started!

\n

Recursion is defined by the repeated application of a procedure. There are three distinct parts to creating a recursive function:

\n
    \n
  1. A terminating base case
  \n
  2. Continuing the recursion
  \n
  3. Making progress towards the base case
  \n
\n

Let\u2019s look at all of them while applying them to a problem:

\n
\n

Given an array of integers, return the sum of their values.

\n
\n

A terminating base case

\n

We need a terminating base case, because otherwise a recursive function will continue forever.

\n

We\u2019ll start backwards (trying to find the case that terminates the algorithm) and work our way from there.

\n

Let\u2019s say we have an empty array: if the array is empty, then it makes sense that its sum is 0.

\n

Let\u2019s start writing some code to express that:

\n
int sum_empty_arr(int *arr, size_t len) {\n  return 0;\n}\n\nint sum(int *arr, size_t len) {\n  if (len == 0) {\n    return sum_empty_arr(arr, len);\n  } else {\n    // In the next section! \n  }\n}
\n

And hey, that\u2019s the only base case.

\n

Continuing the Recursion

\n

To continue the recursion, let\u2019s continue thinking: if we have a one item array, what do we do?

\n

A one item array\u2019s sum can be expressed like this:

\n

arr[0] + 0.

\n

Let\u2019s take {1, 2} as our array. Well, the sum of the array {1, 2} can be expressed like this:

\n

arr[0] + sum({2})

\n

Similarly, if we have a 3 item array like {1, 2, 3}, the sum of the array can be expressed like this:

\n

arr[0] + sum({2, 3})

\n

This formula works on arrays with any length.

\n

Our key insight here is to take the first item of the array, and sum it with the result of the sum of the rest of the items in the array. We\u2019ve found a way to continue the recursion.

\n

Next, we\u2019ll have to think about how to make progress towards the base case.

\n

Making progress towards the base case

\n

In our formula, we\u2019ve found a way to continue the recursion. But are we making progress towards the base case?

\n

Our base case will terminate when it is provided an array with a length of 0.

\n

In every recursive call, we reduce the length of the array we provide to sum by 1. Thus, as long as our array length is positive (which we can safely assume), we\u2019ll make progress towards the base case. Nice! We won\u2019t recurse forever.

\n

Implementation

\n

We can turn our idea into code like so (I\u2019m doing a bit of pointer arithmetic to move the pointer past the first item and decrementing the length before calling sum again).

\n
int sum(int *arr, size_t len) {\n    if (len == 0) {\n        return sum_empty_arr(arr, len);\n    }  else {\n        int head = arr[0];\n        len--; // decrement the length\n        arr++; // move past the first item\n        return head + sum(arr, len);\n    }\n}
\n

And hey, if we run it:

\n
int main(void) {\n    int arr[] = {1,2,3,4};\n    int total = sum(arr, 4);\n\n    printf("%d\\n", total); // 10\n}
\n

And we get the correct result.

\n

We can clean this code up a bit:

\n
int sum(int *arr, size_t len) {\n    if (len == 0) {\n        return 0;\n    } else {\n        return arr[0] + sum(arr + 1, len - 1);\n    }\n}
\n

Interestingly enough, even though python has a slicing operator, the implementation is similar in length:

\n
def list_sum(arr):\n  if len(arr) == 0:\n    return 0\n  else:\n    return arr[0] + list_sum(arr[1:])
\n

Similarly, in a language like OCaml, where recursion is the idiomatic way to express algorithms:

\n
let rec sum = function\n  | [] -> 0 (* if the list is empty, return 0 *)\n  | h::t -> h + (sum t) (* otherwise, return the value of the head + the sum of the rest of the elements. *)
\n

Let\u2019s try another problem:

\n
\n

Given a binary tree, calculate the sum of the values of all nodes in the binary tree.

\n
\n

Let\u2019s go through the steps again.

\n

Finding a base case

\n

Let\u2019s ask ourselves what the possible cases are:

\n

If there is no node, because the node is null, clearly it shouldn\u2019t count. Much like in the empty array case, let\u2019s return 0.

\n

If there is a node, let\u2019s return its value.

\n

Great, we\u2019ve got all our base cases covered. Let\u2019s express them before we continue on:

\n
int sum(TreeNode *node) {\n  if (node == NULL)\n    return 0;\n  else\n    return node->val;\n}
\n

But how do we continue the recursion?

\n

Continuing the Recursion

\n

To continue the recursion, we can apply the function we\u2019ve created to its left and right node. But how? Well, thinking back to the previous problem, the sum of an array is the sum of the current value (the head) + the rest of the items in the array. Likewise, for a binary tree, we need to find the sum of the left items and the sum of the right items.

\n

Since we know that a null node can\u2019t point to anything, we can leave that case be, and express the sum of the binary tree as its current value + the sum of its left child + the sum of its right child.

\n

Let\u2019s do that:

\n
int sum(TreeNode *node) {\n  if (node == NULL)\n    return 0;\n  else\n    return node->val + sum(node->left) + sum(node->right);\n}
\n

Making Progress

\n

Are we making progress? We must be: every call moves on to the node\u2019s children, and in a finite binary tree the chains of children eventually end in null (of course, we can\u2019t calculate the sum of an infinitely large binary tree).

\n

We did it! We can take this same idea and apply it to linked lists and graphs as well. That\u2019ll be an exercise for the reader, but the idea is very similar.

\n

Appendix

\n

Full code to sum of the nodes of a binary tree:

\n

In C:

\n
#include <stdio.h>\n\ntypedef struct TreeNode {\n  int val;\n  struct TreeNode *left;\n  struct TreeNode *right;\n} TreeNode;\n\nint sum(TreeNode *node) {\n  if (node == NULL)\n    return 0;\n  else\n    return node->val + sum(node->left) + sum(node->right);\n}
\n

In Java:

\n
class Solution {\n  record TreeNode(int val, TreeNode left, TreeNode right) {}\n\n  public int sum(TreeNode node) {\n    if (node == null) {\n      return 0;\n    } else {\n      return node.val() + sum(node.left()) + sum(node.right());\n    }\n  }\n}
\n

In OCaml:

\n
type 'a tree = \n  | Node of 'a tree * 'a * 'a tree\n  | Leaf;;\n\nlet rec fold_tree f a t = \n    match t with\n      | Leaf -> a\n      | Node (l, x, r) -> f x (fold_tree f a l) (fold_tree f a r);;
\n \n\n", "date_published": "2021-06-03T11:30:55-04:00" }, { "id": "gen/pack-enums-in-c.html", "url": "gen/pack-enums-in-c.html", "title": "Pack Enums in C", "content_html": "\n\n \n \n \n \n \n \n\n Pack Enums in C\n \n \n \n \n \n \n \n \n
\n

Pack Enums in C

\n

Date: 2021-05-29T00:16:07-04:00

\n
\n

What\u2019s the default size of an enum in C? Usually, it\u2019s the size of an int. We can see that by using sizeof in an example program:

\n
#include <stdio.h>\n\ntypedef enum Example {\n    One = 1,\n    Two = 2,\n} Example;\n\nint main(void) {\n    printf("The size of Example is: %zu\\n", sizeof(Example));\n}
\n

Which prints out 4.

\n

The standard only requires that an enum\u2019s underlying type be an integer type that can represent all of its values; in practice, compilers usually pick int (4 bytes).

\n

But 4 bytes seems wasteful, especially in this case: if we have an enum with two choices, we only need a bit to represent it.

\n

In GCC and Clang, there\u2019s support for __attribute__ modifiers. Let\u2019s use the ((__packed__)) modifier to tell the compiler to use the smallest type it can for our enum.

\n
#include <stdio.h>\n\ntypedef enum Example {\n    One = 1,\n    Two = 2,\n} __attribute__ ((__packed__)) Example;\n\nint main(void) {\n    printf("The size of Example is: %zu\\n", sizeof(Example));\n}
\n

Which prints out 1. (our enum is now represented by an unsigned char).

\n
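If anything depends on the packed size (a wire format, say), it can be pinned down at compile time with C11\u2019s _Static_assert. This sketch assumes GCC or Clang, since the attribute is a compiler extension:

```c
typedef enum Example {
    One = 1,
    Two = 2,
} __attribute__ ((__packed__)) Example;

/* Compilation fails here if the enum ever grows past one byte. */
_Static_assert(sizeof(Example) == 1, "Example should fit in one byte");
```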

If you want to make enums more dense in memory, this is the way to go.

\n \n\n", "date_published": "2021-05-29T00:16:07-04:00" }, { "id": "gen/unix-environment-variables.html", "url": "gen/unix-environment-variables.html", "title": "Unix Environment Variables", "content_html": "\n\n \n \n \n \n \n \n\n Unix Environment Variables\n \n \n \n \n \n \n \n \n
\n

Unix Environment Variables

\n

Date: 2021-05-24T18:58:40-04:00

\n
\n

Let\u2019s talk about some popular unix environment variables:

\n
    \n
  • $USER - The current user
  • \n
  • $PAGER - the program used to view output one page at a time. less and more are good examples.
  • \n
  • $VISUAL - A full screen editor (like vi, emacs, and nano).
  • \n
  • $EDITOR - A line by line editor (ed or ex work).
  • \n
  • $PWD - the current working directory
  • \n
  • $HOME - the home directory
  • \n
  • $LANG - the language you use, with an optional encoding.
  • \n
  • $MANPATH - the list of directories to search for manual pages.
  • \n
  • $MAIL - where mail goes
  • \n
  • $SHELL - path to shell binary you use (e.g.\u00a0/bin/bash, /bin/ksh, /bin/sh, /bin/zsh)
  • \n
\n

The most important one is probably $PATH, which is the list of directories the OS searches for binaries. It searches from beginning to end, executing the first matching binary it finds.

\n

Let\u2019s say my $PATH is like this:

\n
/usr/local/bin:/usr/bin
\n

This instructs the OS to first look in /usr/local/bin for a valid binary, and then in /usr/bin.

\n

The /bin directory contains binaries for sysadmins and users that are required even when no other filesystem is mounted (such as in single-user mode).

\n

The /usr/bin directory was meant to contain executable programs that were part of the OS,

\n

and /usr/local/bin is for software that the user installs.

\n

There are directories where superuser binaries should be located, which follow the same scheme:

\n
    \n
  • /sbin
  • \n
  • /usr/sbin
  • \n
  • /usr/local/sbin
  • \n
\n

As well, /usr/share/bin is for binaries used for web servers and clients.

\n

If you find that a command doesn\u2019t work, double-check to make sure that your $PATH is set up properly to find the correct binary.

\n \n\n", "date_published": "2021-05-24T18:58:40-04:00" }, { "id": "gen/the-central-limit-theorem.html", "url": "gen/the-central-limit-theorem.html", "title": "The Central Limit Theorem", "content_html": "\n\n \n \n \n \n \n \n\n The Central Limit Theorem\n \n \n \n \n \n \n \n \n
\n

The Central Limit Theorem

\n

Date: 2021-05-19T19:55:46-04:00

\n
\n

The central limit theorem (CLT) states that when independent random variables are added, their properly normalized sum tends toward a normal distribution (a bell curve) even if the original variables themselves are not normally distributed. ~ Wikipedia

\n

One constraint is that the variables must have finite variance, which limits the effect of outliers.

\n

Let\u2019s roll a dice 1000 times and see what that gets us:

\n
from collections import Counter\nimport random\n\ndef r():\n    return random.randrange(1, 7)\n\n\nrolls = Counter()\n\nfor _ in range(1000):\n    rolls[r()] += 1\n\nsorted_dict = {k: rolls[k] for k in sorted(rolls)}\nprint(sorted_dict)
\n

Running it I get this:

\n
{1: 156, 2: 168, 3: 192, 4: 143, 5: 170, 6: 171}
\n
    \n
  • 1 was rolled 156 times
  • \n
  • 2 was rolled 168 times
  • \n
  • 3 was rolled 192 times
  • \n
  • 4 was rolled 143 times
  • \n
  • 5 was rolled 170 times
  • \n
  • 6 was rolled 171 times
  • \n
\n
\n\"Output\"
Output
\n
\n

You\u2019ll see that there\u2019s some variance, but we get a good enough result.

\n

To test the central limit theorem, let\u2019s try to roll two dice at the same time 1000 times and plot it.

\n

Change this line to this:

\n
for _ in range(1000):\n    rolls[r() + r()] += 1
\n

Here were the rolls:

\n
{2: 22, 3: 59, 4: 81, 5: 110, 6: 149, 7: 174, 8: 144, 9: 101, 10: 77, 11: 53, 12: 30}
\n

And here\u2019s the distribution:

\n
\n\"Two
Two Dice Rolls
\n
\n

You\u2019ll notice the distribution is starting to peak around 7, while 2 and 12 were fairly unlikely.

\n

We\u2019re starting to get a normal distribution!

\n

Let\u2019s do 5 dice rolls.

\n

Change the line below:

\n
for _ in range(1000):\n    rolls[r() + r() + r() + r() + r()] += 1
\n

Here\u2019s the outcome:

\n
{7: 2, 8: 8, 9: 8, 10: 14, 11: 33, 12: 35, 13: 54, 14: 72, 15: 61, 16: 85, 17: 94, 18: 93, 19: 108, 20: 88, 21: 82, 22: 65, 23: 37, 24: 22, 25: 17, 26: 11, 27: 8, 28: 2, 29: 1}
\n

And the five dice rolls.

\n
\n\"Five
Five Dice Rolls
\n
\n

We get closer to a normal distribution.

\n

Let\u2019s do it a million times:

\n
for _ in range(1000000):\n    rolls[r() + r() + r() + r() + r()] += 1
\n

Here\u2019s the outcome:

\n
{5: 147, 6: 600, 7: 1890, 8: 4469, 9: 9147, 10: 16052, 11: 26310, 12: 39423, 13: 54070, 14: 69536, 15: 83228, 16: 94310, 17: 99997, 18: 100488, 19: 94484, 20: 83745, 21: 69777, 22: 53880, 23: 39506, 24: 26489, 25: 16144, 26: 9076, 27: 4547, 28: 1900, 29: 649, 30: 136}
\n

And hey look, that looks like a normal distribution to me!

\n
\n\"Million
Million rolls
\n
\n \n\n", "date_published": "2021-05-19T19:55:46-04:00" }, { "id": "gen/file-io.html", "url": "gen/file-io.html", "title": "File Io", "content_html": "\n\n \n \n \n \n \n \n\n File Io\n \n \n \n \n \n \n \n \n
\n

File Io

\n

Date: 2021-05-19T16:41:02-04:00

\n
\n

Table of Contents

\n \n

File I/O is slow: but how slow is it really? Are there any ways we can make it faster? Let\u2019s find out!

\n

First let\u2019s start out by writing the character a to a file in python:

\n
import timeit\n\n\ndef test():\n    with open('output.txt', 'w+') as f:\n        f.write('a')\n\n\nif __name__ == "__main__":\n    print(\n        f'This took {timeit.timeit("test()", globals=locals(), number=1)} seconds.')
\n

On my machine, this prints out:

\n
This took 0.0002999180000000032 seconds.
\n

Makes sense. Since it\u2019s hard to look at such small numbers, let\u2019s bump our number of repetitions up to 10000.

\n
import timeit\n\n\ndef test():\n    with open('output.txt', 'w+') as f:\n        f.write('a')\n\n\nif __name__ == "__main__":\n    print(\n        f'This took {timeit.timeit("test()", globals=locals(), number=10000)} seconds.')
\n

On my machine, this prints out:

\n
This took 4.582478715000001 seconds.
\n

Let\u2019s try something similar but in memory. Let\u2019s add the string a to an empty string and return it:

\n
import timeit\n\n\ndef test():\n    s = ''\n    s += 'a'\n    return s\n\n\nif __name__ == "__main__":\n    print(\n        f'This took {timeit.timeit("test()", globals=locals(), number=10000)} seconds.')
\n

On my machine, this prints out:

\n
This took 0.0009243260000000031 seconds.
\n

Doing some math, writing to a file 10000 times is roughly 5000x slower than writing to a string 10000 times in memory.

\n

So our intuition (and our Operating Systems textbooks) are correct. Let\u2019s dig deeper to see if we can find anything else.

\n

Intuition

\n

Since all we\u2019re doing is opening a file, writing to it, closing the file 10000 times, maybe there\u2019s some way to speed up this operation.

\n

Let\u2019s build a mental model for how python writes to a file:

\n
    \n
  1. Open output.txt.
  2. Write the character a to output.txt.
  3. Close the file.
\n

Suggestion 1:

\n

Since we\u2019re opening and closing the same file, what if we had some abstraction that represented the file? Let\u2019s say we had some integer that would represent the file (a file descriptor) and we kept track of its state inside of our program. Whenever we need to save our changes to disk, we notify the OS.

\n

So instead of doing:

\n
repeat 10000 times:\n  open `output.txt`\n  clear the contents of `output.txt`\n  write `a` to output.txt\n  close `output.txt`
\n

Which would require us to open the same file 10000 times:

\n

We try this:

\n
file_contents = {}\nfile_contents['output.txt'] = 'a'\nopen file\nclear the contents of `output.txt`\nwrite file_contents['output.txt'] to `output.txt`\nclose `output.txt`
\n

Which would only require 1 call to the OS to open the file, 1 call to the OS to write to the file, and 1 call to the OS to close the file.

\n
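The batching idea above can be sketched directly in Python (my own sketch, not from the original post; absolute timings vary by machine):

```python
import timeit


def reopen_each_time():
    # open-write-close on every iteration: 10000 open/close round trips
    for _ in range(10000):
        with open('output.txt', 'w+') as f:
            f.write('a')


def open_once():
    # one open, many buffered writes, one close
    # (note: final file contents differ from reopen_each_time, since 'w+'
    # truncates on every reopen; this only compares open/close overhead)
    with open('output.txt', 'w+') as f:
        for _ in range(10000):
            f.write('a')


if __name__ == "__main__":
    print(f'reopen: {timeit.timeit(reopen_each_time, number=1)} seconds')
    print(f'once:   {timeit.timeit(open_once, number=1)} seconds')
```

Most of the cost in the first version is the repeated open and close calls, so the open-once version should run dramatically faster while issuing the same 10000 writes.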

Python does this to some degree out of the box: each file object keeps an in-memory buffer of pending writes, and when the buffer fills up (or the file is closed), Python hands the accumulated changes to the OS.

\n

To make Python commit its buffer to the OS early, call the file object\u2019s flush() method.

\n
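A minimal sketch of flushing (my own illustration, not from the original post):

```python
# Python file objects buffer writes in userspace; flush() hands the
# buffered bytes to the OS without waiting for the buffer to fill or
# for the file to be closed.
with open('output.txt', 'w') as f:
    f.write('a')
    f.flush()  # the OS now has the byte, though it may not be on disk yet
```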

Suggestion 2:

\n

What if the OS had a cache too? Since there are many processes trying to access the OS\u2019 resources, the OS has a chance to reconcile file writes and batch them in a way that is more efficient.

\n

Let\u2019s say we ran the same python program twice at exactly the same time. If we only employed caching at the python level, we\u2019d have to write to the same file twice with the character a. Of course, the OS can reconcile those changes and make it so there\u2019s only 1 open-write-close cycle required.

\n

It turns out both of these suggestions are implemented.

\n

To force the OS to propagate a change, you can use the os.fsync(f.fileno()) function. When called, Python asks the OS to persist the changes in file descriptor f to disk.

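Putting the two layers together (again my own sketch): flush() empties Python\u2019s userspace buffer into the OS page cache, and os.fsync() asks the OS to push the page cache to disk.

```python
import os

with open('output.txt', 'w') as f:
    f.write('a')
    f.flush()             # userspace buffer -> OS page cache
    os.fsync(f.fileno())  # OS page cache -> disk
```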
\n \n\n", "date_published": "2021-05-19T16:41:02-04:00" }, { "id": "gen/const-correctness.html", "url": "gen/const-correctness.html", "title": "Const Correctness", "content_html": "\n\n \n \n \n \n \n \n\n Const Correctness\n \n \n \n \n \n \n \n \n
\n

Const Correctness

\n

Date: 2021-05-17T19:32:39-04:00

\n
\n

Const correctness means marking everything you can as const to prevent unwanted mutation. Let\u2019s say you want to grab a few options from a settings map that you\u2019ve created.

\n

Let\u2019s say you want the time-to-live and created-at values from the map.

\n

Does this compile?

\n
void setTimeToLive(int ttl) {/* implementation here */}\nvoid setCreatedAt(int createdAt) {/* implementation here */}\n\nvoid getOptions(const std::map<const char*, int> &m) noexcept {\n  const auto ttl = m["ttl"];\n  const auto createdAt = m["createdAt"];\n  setTimeToLive(ttl);\n  setCreatedAt(createdAt);\n}
\n

Nope: since operator[] inserts a default-constructed value when it doesn\u2019t find a matching key, this doesn\u2019t compile. We can\u2019t insert into a map marked const.

\n

Let\u2019s say we didn\u2019t mark the map as const:

\n
void setTimeToLive(int ttl) {/* implementation here */}\nvoid setCreatedAt(int createdAt) {/* implementation here */}\n\nvoid getOptions(std::map<const char*, int> &m) noexcept {\n  const auto ttl = m["tttl"]; // oops, typo\n  const auto createdAt = m["createdAt"];\n  setTimeToLive(ttl);\n  setCreatedAt(createdAt);\n}
\n

This compiles now, but oh no, what\u2019s the value of ttl? 0. When we access a key that doesn\u2019t exist through operator[], the map inserts a value-initialized default \u2013 in our case, an int of 0.

\n

So we\u2019re at a crossroads. We want our code to compile, be correct, and still allow the map to be const.

\n

Let\u2019s let that happen:

\n
void setTimeToLive(int ttl) {/* implementation here */}\nvoid setCreatedAt(int createdAt) {/* implementation here */}\n\nvoid getOptions(const std::map<const char*, int> &m) {\n  const auto ttl = m.at("tttl"); // oops, typo\n  const auto createdAt = m.at("createdAt");\n  setTimeToLive(ttl);\n  setCreatedAt(createdAt);\n}
\n

We replace map::operator[] with map::at. map::at does a checked lookup: if it doesn\u2019t find the key, it throws a std::out_of_range exception.

\n
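As an aside (my analogy, not from the original post), Python\u2019s dict makes the checked lookup the default: plain indexing raises KeyError, much like map::at throwing, while collections.defaultdict silently inserts a default, much like map::operator[].

```python
from collections import defaultdict

options = {'ttl': 60, 'createdAt': 1621}

# checked lookup, like map::at: a typo'd key raises instead of
# silently handing back a default
try:
    ttl = options['tttl']  # oops, typo
except KeyError:
    ttl = None

# inserting lookup, like map::operator[]: the typo'd key is created
# with a default value of 0
counts = defaultdict(int, options)
bad_ttl = counts['tttl']  # oops, typo: silently returns 0 and inserts the key

assert ttl is None
assert bad_ttl == 0 and 'tttl' in counts
```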

We remove our noexcept from the function because this function can throw and we move on with our lives.

\n

Const correctness saves lives.

\n \n\n", "date_published": "2021-05-17T19:32:39-04:00" }, { "id": "gen/virtual-functions.html", "url": "gen/virtual-functions.html", "title": "Virtual Functions", "content_html": "\n\n \n \n \n \n \n \n\n Virtual Functions\n \n \n \n \n \n \n \n \n
\n

Virtual Functions

\n

Date: 2021-05-16T21:28:00-04:00

\n
\n

While learning Java, I came across this line of code:

\n
List<Integer> array = new ArrayList<>();
\n

As someone who knows C++, this is somewhat confusing \u2013 List is an interface that ArrayList implements. But if we treat ArrayList as a list, then when we call List.add() we must use dynamic dispatch to find the right implementation, since the implementation of add isn\u2019t on the list interface.

\n

That involves a virtual function lookup.

\n

In Java, most function calls are virtual, unless a method is declared as final (which means it cannot be overridden).

\n

By doing this, we get an advantage \u2013 if we decide to change our array variable from an ArrayList to another List interface conforming type, we can do so without breaking any code.

\n

In exchange, we cannot use any ArrayList specific methods without casting to ArrayList, which nullifies this benefit.

\n

We have information that both ArrayList and LinkedList implement the List interface, so we can treat them as such in a collection.

\n
List<List<Integer>> lists = new ArrayList<>();\nlists.add(new ArrayList<>());\nlists.add(new LinkedList<>());\n\nfor (List<Integer> l : lists) {\n    l.add(10); // defer to each list's implementation of add\n}
\n

Since we know that Java uses virtual functions for interface implementations, everything works as expected. l.add(10) defers to the implementation of ArrayList and LinkedList, and a 10 is appended to each list.

\n

A full example of virtual functions might look something like this:

\n
public class Animal {\n  void move() {\n    System.out.printf("walk %s\\n", this.name());\n  }\n\n  String name() {\n    return "Animal";\n  }\n\n  public static void main(String[] args) {\n    Animal animals[] = {new Animal(), new Bird(), new Elephant()};\n\n    for (Animal animal : animals) {\n      animal.move();\n    }\n  }\n}\n\nclass Bird extends Animal {\n  @Override\n  void move() {\n    System.out.printf("fly %s\\n", this.name());\n  }\n\n  @Override\n  String name() {\n    return "Bird";\n  }\n}\n\nclass Elephant extends Animal {\n  @Override\n  String name() {\n    return "Elephant";\n  }\n}
\n

Running this prints this out:

\n
walk Animal\nfly Bird\nwalk Elephant
\n

Here we create a base class, Animal, which has two methods, name and move. This class is instantiable because it has default implementations for both methods. The Bird class overrides the move method, since it flies instead of walks, and the Elephant only overrides the name. When we collect them into an array and call the move() method on each animal as an Animal, Java does the virtual function lookup and calls the correct overridden method, as we expect.

\n
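For comparison (my addition, not part of the original survey), Python sits at the Java end of the spectrum: every ordinary method call is dynamically dispatched, with no keywords required at all:

```python
class Animal:
    def name(self):
        return "Animal"

    def move(self):
        # every call here is dispatched on the runtime type of self
        print(f"walk {self.name()}")


class Bird(Animal):
    def name(self):
        return "Bird"

    def move(self):
        print(f"fly {self.name()}")


class Elephant(Animal):
    def name(self):
        return "Elephant"


for animal in (Animal(), Bird(), Elephant()):
    animal.move()  # walk Animal, fly Bird, walk Elephant
```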

It turns out that not every language does this, mainly because there is a runtime cost in dynamic dispatch.

\n

In C++, you must designate a function as virtual, which labels it as overridable by a derived class.

\n

A roughly word for word translation of the above java program would look like this:

\n
#include <stdio.h>\n\nstruct Animal {\n  virtual const char *name() const { return "Animal"; }\n  virtual void move() const { printf("walk %s\\n", name()); }\n  virtual ~Animal() {}\n};\n\nstruct Bird : public Animal {\n  virtual void move() const override { printf("fly %s\\n", name()); }\n  virtual const char *name() const override { return "Bird"; }\n};\n\nstruct Elephant : public Animal {\n  virtual const char *name() const override { return "Elephant"; }\n};\n\nint main(void) {\n  const Animal *animals[] = {new Animal(), new Bird(), new Elephant()};\n\n  for (const auto &animal : animals) {\n    animal->move();\n  }\n}
\n

This returns:

\n
walk Animal\nfly Bird\nwalk Elephant
\n

In the Java version it isn\u2019t apparent to the programmer that there is a runtime cost, but C++ puts it front and center. Since we used the new keyword, every object is placed on the heap. When we call the move() method, we do it through the -> operator, which dereferences a pointer to an object on the heap rather than calling a method on a stack object directly.

\n

Let\u2019s say we don\u2019t use the new operator, and don\u2019t use a pointer lookup:

\n
int main(void) {\n  const Animal animals[] = {Animal(), Bird(), Elephant()};\n\n  for (const auto &animal : animals) {\n    animal.move();\n  }\n}
\n

This prints out:

\n
walk Animal\nwalk Animal\nwalk Animal
\n

Since we are storing each animal as a plain Animal value, the Bird and Elephant objects are sliced down to Animal, and animal.move() calls the default Animal implementation of move. If we choose to have no runtime cost, we get less desirable behavior. But C++ gives you the choice upfront.

\n

Let\u2019s dig deeper.

\n

C doesn\u2019t have any language support for virtual functions, but we can still emulate them.

\n

A (very) rough translation of the java program might look like this:

\n
#include <stdio.h>\n\ntypedef struct Animal {\n  const char *name;\n  void (*move)(const struct Animal *);\n} Animal_t;\n\nvoid fly(const Animal_t *a) { printf("fly %s\\n", a->name); }\nvoid walk(const Animal_t *a) { printf("walk %s\\n", a->name); }\nvoid animal_move(const Animal_t *a) { a->move ? a->move(a) : walk(a); }\n\nint main(void) {\n  const Animal_t animals[] = {\n      {.name = "Animal"}, {.name = "Bird", .move = fly}, {.name = "Elephant"}};\n  const size_t animals_len = sizeof(animals) / sizeof(Animal_t);\n\n  for (int i = 0; i < animals_len; i++) {\n    const Animal_t animal = animals[i];\n    animal_move(&animal);\n  }\n}
\n

C actually lets us do some interesting things here: it allows us to allocate our animals on the stack. We also create an animal_move function that asks the struct passed in whether it has a function pointer for move. If it does, it defers to that; otherwise it calls a default implementation. If we choose to use a more specific version of move, we pay the pointer-lookup cost, but if not, there is no cost.

\n

Predictably, this prints:

\n
walk Animal\nfly Bird\nwalk Elephant
\n

Climbing back up the abstraction ladder, we find that Rust has a similar concept, but it does away with keywords like virtual or final for designating dynamic dispatch.

\n
pub trait Animal {\n    fn name(&self) -> String {\n        "Animal".to_string()\n    }\n    fn act(&self) {\n        println!("walk {}", self.name());\n    }\n}\n\nstruct GenericAnimal {}\nimpl Animal for GenericAnimal {}\n\nstruct Bird {}\nimpl Animal for Bird {\n    fn name(&self) -> String {\n        "Bird".to_string()\n    }\n    fn act(&self) {\n        println!("fly {}", self.name());\n    }\n}\n\nstruct Elephant {}\nimpl Animal for Elephant {\n    fn name(&self) -> String {\n        "Elephant".to_string()\n    }\n}\n\nfn main() {\n    // Box<dyn Animal> gives us owned trait objects with dynamic dispatch\n    let animals: Vec<Box<dyn Animal>> = vec![\n        Box::new(GenericAnimal {}),\n        Box::new(Bird {}),\n        Box::new(Elephant {}),\n    ];\n    for animal in &animals {\n        animal.act();\n    }\n}
\n

We create a trait of animal (kind of like an interface, but supercharged) and then we create an instantiable version of it (GenericAnimal) that just takes the default implementation. Then we implement our Bird and Elephant, collect them into a vector of trait objects, and call the method on each. Because the vector holds dyn Animal trait objects, the compiler knows we want dynamic dispatch and does it for us. Rust also has fully qualified syntax for trait methods, though unlike C++\u2019s qualified calls it still dispatches to the type\u2019s own implementation \u2013 there is no built-in way to invoke a default method that a type has overridden:

\n
Animal::act(&Bird {}); // fully qualified syntax; still dispatches to Bird::act
\n

C++ has a similar-looking qualified-call syntax, and there the qualified call really does select the base implementation:

\n
const Bird* bird = new Bird();\nbird->Animal::move(); // calls Animal::move with bird.
\n

In short, here\u2019s a history of virtual functions and their syntax, starting from C and ending with Rust:

\n

In C, there\u2019s a rough way to emulate virtuals, but it takes some effort since it\u2019s not built into the language.

\n

In C++, virtual functions were deemed useful enough to be built into the language. This led to a terser syntax for overriding, but led to more cognitive load on the programmer, since they had to choose virtual or non-virtual implementations.

\n

In Java, most methods are by default virtual, so the implementation details are hidden from the programmer. Java allows you to declare non-overridable methods as final, so final methods have no runtime cost, and all @Override methods have runtime cost, which is a fair tradeoff.

\n

In Rust, the compiler figures out whether you want a virtual or a normal method call from your types \u2013 dyn trait objects get dynamic dispatch, while calls on concrete types are resolved statically. This allows for even less cognitive load than in Java (no @Override or final necessary), though C++ keeps one escape hatch Rust lacks: qualified-call syntax for invoking the base class method on an overriding type.

\n \n\n", "date_published": "2021-05-16T21:28:00-04:00" }, { "id": "gen/the-strong-force-and-weak-force-of-companies.html", "url": "gen/the-strong-force-and-weak-force-of-companies.html", "title": "The Strong Force and Weak Force of Companies", "content_html": "\n\n \n \n \n \n \n \n\n The Strong Force and Weak Force of Companies\n \n \n \n \n \n \n \n
\n

The Strong Force and Weak Force of Companies

\n

Date: 2020-08-18T13:19:06-05:00

\n
\n

There are four fundamental interactions in physics: gravity, electromagnetism, the strong force, and the weak force. Particles are affected by all four of these \u2013 hence why they are fundamental to understanding physics. The strong force holds the nucleus of an atom together. The weak force drives the decay that tears the nucleus of an atom apart, creating radioactivity.

\n

The strong force is a million times stronger than the weak force, but only at very short range \u2013 on the order of a femtometer, the scale of a nucleus. However, as you add more protons and neutrons to the nucleus of an atom, the atom gets bigger \u2013 and the weak force becomes relatively stronger and stronger, until eventually the nucleus of an atom is too big to be kept together by the strong force, and the weak force tears the atom apart, resulting in a radioactive atom. Companies follow the same trajectory.

\n

Companies are a lot like atoms; at first, growth is key \u2013 you want to extend your revenue by having people buy the service you offer them. How do you build a better service? By hiring. So you hire people to improve your service. At first, this works excellently. Your first few hires have great impact at the company, and they lay the foundation for hires in the future. Of course, it\u2019s not all roses \u2013 they make some mistakes, and put in guardrails to make sure future employees don\u2019t make the same mistakes. But at this point of the curve, it is worth way more to you as a company to continue hiring, since each hire gives you more value than you pay them. But as time passes and you hire more people, each hire has less impact. They\u2019re encumbered by the communication cost of talking to multitudes of stakeholders, and the cost of checking all of the guardrails their predecessors laid down for them. Eventually work slides to a snail\u2019s pace, and it shows \u2013 the product doesn\u2019t improve very much, if at all \u2013 features that might take a few days take months at a time now, and your revenue doesn\u2019t increase as much as it used to. You\u2019ve hit your first slope. Your competitive advantage, your strong force is fading. Everyone else is coming for you. The weak force is strengthening.

\n

You have a few ways of delaying the inevitable \u2013 maybe you find a novel way to manage resources so you become better than the competition. Maybe you realize that some of the rules you\u2019ve enshrined don\u2019t do much for you, so you cut excess baggage. Maybe you can\u2019t do the above, and you cut excess resources in the form of employees. You\u2019ve managed to keep the competition at bay, but your strong force weakens as you do this. If you manage resources to become better than the competition, what stops the competition from copying you? If you cut excess baggage, you run the risk of cutting out rules that were well-intentioned and protected you from damage. You run the risk of plunging your company into a dark age, where the employees have to rediscover the practices that were purged from the annals of history to resume work. If you cut employees, you risk the loss of morale that keeps your company churning. If the company is willing to axe some people, what stops it from axing the rest of them? Each time you do this, you come closer to your doom by the hands of the weak force.

\n

A company that keeps delaying the inevitable soon falls to its own forces. It blows up, metaphorically. People stop wanting to work at the firm because there\u2019s too much bureaucracy. The product stops dominating the market, and revenues fall. The consumers drop you for your competition. Your company blows up, becoming a waste zone.

\n

Business is all about keeping the weak force away. As we\u2019ve made progress in business, we\u2019ve learned how to do this better. Companies find ways to make more revenue with fewer people, and do more with less. But this too shall pass \u2013 the companies of yesteryear will soon be left in the dust. And so on and so forth. I wonder what the future of organization has in store for the rest of us?

\n \n\n", "date_published": "2020-08-18T13:19:06-05:00" }, { "id": "gen/the-business-curve.html", "url": "gen/the-business-curve.html", "title": "The Business Curve", "content_html": "\n\n \n \n \n \n \n \n\n The Business Curve\n \n \n \n \n \n \n \n
\n

The Business Curve

\n

Date: 2020-08-08T10:25:54-05:00

\n
\n

In Zero to One, Peter Thiel proclaims \u201cCompetition is for losers.\u201d \u2013 if you can make a monopoly, you should, because you can extract far more profit from it than if you participated in an open market.

\n

In a free market, with perfect competition, you face the supply and demand curve below. You have to sell the good at the price where supply meets demand, and you can sell up to, but not more than, the quantity where demand meets supply. Assuming you\u2019re not the only seller of the good, you have to settle for some chunk of the market.

\n

\n

Assume there are 5 merchants, and that the amount of revenue you can bring in is the rectangle formed under the point where supply meets demand; you get something like this:

\n

\n

All of you split the market somewhat equally, and you make 0 economic profit. Nobody can sell the good for more than it\u2019s worth (opportunity cost + cost of production), and nobody will buy the good for more than it\u2019s worth (opportunity cost + cost of production). The perfectly free market makes buyers and sellers equally well off.

\n

\n

This assumes that the first few products that a firm sells on the open market are the cheapest for it to produce, and that there are buyers that are willing to pay more for the product than others, because it is inherently worth more to them. The firms get to sell the goods that cost less for them to make at a higher price, and the buyers who would\u2019ve bought the good at a higher price to get to pay less for the good. Everyone wins a little bit.

\n

But firms want to make economic profit. This is where the real money comes in. You can do this in many ways; organizations like OPEC (the Organization of the Petroleum Exporting Countries) band together every year and restrict the supply of oil on the open market. This lets them charge more for their good (oil), since artificial scarcity drives up the price of oil.

\n

Because of this, the market for oil looks something like this:

\n

\n

The utility that both sides get is:

\n

\n

Nice! We\u2019ve made economic profit. What we\u2019ve done here is taken a large slice of the pie for ourselves; the buyers lose some utility, but in exchange the sellers get more utility. However, we\u2019ve lost some total utility. The area to the right of the seller\u2019s utility gain is now white. That\u2019s utility that we\u2019ve lost. This is called Deadweight Loss, because OPEC has now made some utility inherent to the market unreachable. The market as a whole could be better off and capture that white area if OPEC decided to sell oil at the market competitive price.

\n

OPEC made the market better for the sellers, in exchange for hurting the buyers, and worse for the economy as a whole.

\n

In this case, OPEC is an oligopoly: a case where there are only a few sellers of a good. A monopoly is when there\u2019s only one, and a duopoly is when there are only two. All of these traditionally create deadweight loss, yielding less utility for the buyer and more for the seller(s). This is why you hear that monopolies and oligopolies are bad and should be broken up by the government.

\n

But OPEC isn\u2019t a maintainable oligopoly. Saudi Arabia doesn\u2019t trust the other members of OPEC. Russia doesn\u2019t trust OPEC. In fact, OPEC doesn\u2019t trust itself. The reason is simple: As a seller, you could make more money if you undercut the market and sell more oil at a lower price. If I sell oil at a lower price than the other countries, I get all of the profits. That entire red rhombus I had to share with the other sellers? All mine. And I expand it a little bit too. The rhombus gets fatter for me, at the expense of all of the other sellers. Oligopolies have a cooperation problem \u2013 the economy cannot enforce its sellers to sell a good at a higher price than it values said good. So Russia, Venezuela, Saudi, and so forth go to each OPEC meeting wondering if this is the time the other countries backstab them and make off with all the gold. In OPEC\u2019s case, all of the countries have to be well armed too, because they don\u2019t know when the other countries might come to capture their oil to increase their profits. Cooperation works worst when all parties are paranoid.

\n

The problem with this model of economics is that it only somewhat applies to some goods in some places. This assumes we have symmetric information \u2013 that sellers know just as much as buyers. This also assumes that our goods are private and no different than each other. But really, every good is somewhat different; coke isn\u2019t exactly the same as pepsi, even though they are similar products. This model also doesn\u2019t count network effects, where making a good become more popular has benefits to all of its users and its overall utility.

\n

If I wanted to buy some British soda \u2013 say, Irn-Bru \u2013 I would have to import it to the US. That costs me both time and money, so the soda costs more than it would if I simply wanted a Coke. If I wanted a Coke, I could just walk down the street, since every grocery store carries it because of its popularity. Coke is cheaper than Irn-Bru.

\n

A service like Uber really wants a monopoly for that reason; because if it had a monopoly, it would be able to have all of the drivers on its platform available to serve all of the riders, and thus be able to get economic profit. Alas, Lyft exists; and so, there are only half the available drivers on Uber and half the available riders on Uber. Uber and Lyft are then also forced to price their service lower and lower to beat each other out and win the loyalty of their riders. Buyers win here, but the sellers are engaged in a race to the bottom. And only one will survive, a la Highlander.

\n

If we assume that a monopoly is the only way to go to gain monopolistic profits, we need to find a way to do so, without using force (because that breeds paranoia). The only way out is by making a better product. If we make a better product, we can create a market that didn\u2019t exist before, and gain monopolistic profits from that market. If this product cannot be replicated, then there is no incentive to stop the monopoly; because the market will only be worse off. Thus, no use of force in any way can stop us from gaining monopoly profits. But building the next Google is hard. Incredibly so.

\n

You can, however, build a product that has differentiation in some way. You must have a feature that is hard to replicate and useful, and then you too can ride a smaller, but still profitable, monopoly curve.

\n \n\n", "date_published": "2020-08-08T10:25:54-05:00" }, { "id": "gen/language-pessimism-and-optimism.html", "url": "gen/language-pessimism-and-optimism.html", "title": "Language Pessimism and Optimism", "content_html": "\n\n \n \n \n \n \n \n\n Language Pessimism and Optimism\n \n \n \n \n \n \n \n
\n

Language Pessimism and Optimism

\n

Date: 2020-04-24T16:31:51-05:00

\n
\n

Table of Contents

\n \n

Let\u2019s talk languages. Not languages like Swahili or Latin, but languages like C or Java. Programming languages. I would argue that the programming language you use day to day is the most important tool you use in programming \u2013 if you use a modern language, it offers a lot of punch compared to something like assembly. The programming languages of today are cross platform, which gives you compatibility, meaning you don\u2019t have to worry about writing your code for every platform or architecture, because your language takes care of it for you. It gives you abstraction, because you can write code that takes a different shape than the way the machine sees it. Machines can\u2019t understand for loops or if statements, or even really anything more than 0s and 1s. We made up things like functions and classes and interfaces and stuff like that. And these abstractions are really good. I\u2019ve never had to wonder whether a for loop or an if statement worked as intended. Programming languages also give you safety \u2013 they prevent you from doing bad things with your program. If I try to cast to a non-compatible type, my compiler might catch that \u2013 if I try to write to some memory I don\u2019t own, my language or operating system will stop that. And that\u2019s just scratching the surface \u2013 there\u2019s plenty more to love about languages. Higher order functions! Algebraic Data Types! Optimizations! Syntactic Sugar! It\u2019s a joyride all the way down. But we\u2019re not going to talk about the nitty-gritty type theory of languages; no, today we\u2019re going to talk about how people view languages in programming, by doing everybody\u2019s favorite thing \u2013 dividing people into two kinds, the pessimists and the optimists.

\n

The Pessimists

\n

The pessimists tend to say something along these lines quite often: \u201cGood code can be written in any language, and bad code can be written in any language. It\u2019s all about discipline when writing code.\u201d COBOL is as good as Java. People made great systems in COBOL, just like they made great systems in Java. And it\u2019s true. Many of the systems underpinning finance were written in COBOL. Many COBOL systems are still out in the wild running, decades after they were written. And to the pessimists\u2019 credit, they work just fine. After all, if a language is Turing complete, it is as powerful as any other language. Basically all popular languages are Turing complete, so they\u2019re theoretically equivalent in power. Some languages are just better at certain things, but they can all express the same things.

\n

The Optimists

\n

The optimists might say that languages are different. An optimist might say that they prefer C++ to C because it has collections and Object Orientation. An optimist might enjoy using smart pointers to aid in memory management. An optimist might like the way that python handles iterable collections. These all aid in code readability, because the abstraction is easy to understand and encapsulates more functionality in fewer lines. Maybe an optimist likes static types, because they make it explicit what types of data a function or method might return, or make it easier to understand the way data transforms throughout its lifecycle. In the optimists\u2019 eyes, there are abstractions which aid in the expression of code, and since different languages choose different abstractions to make the expression of some types of programs harder and some easier, there are some differences between languages. You can program in an object oriented style in C. You can program in a generic style in C. These are all far less compact than their equivalent C++. There is an aesthetic difference, if not theoretical difference.

\n

A Twist

\n

If you follow my points up above, I\u2019m promoting the idea that language optimists prefer the aesthetics of a language, whereas the language pessimists prefer the theoretical power of a language. Theoretical power of a language is not bounded by expressiveness, however. If it took you 100 lines of code to read from a file in a toy language (let\u2019s call this language Airplane), but 1 line of code to read from a file in another language (let\u2019s call this language Ship), Ship and Airplane would be theoretically equal in power. But in practice, most people would prefer to use Ship for reading files. 1 line of code is 100x less than 100. If this were the case for everything (let\u2019s say it took 100x more lines to write any possible program in Airplane than in Ship), you would probably prefer Ship. In terms of expressiveness, Ship is better than Airplane.

\n

Ship is still not more powerful than Airplane; it is simply more expressive.

\n

But that\u2019s still just aesthetics; what if there were a class of problems that Ship did not suffer from, but Airplane did? Let\u2019s say that Airplane and Ship are both Web programming languages, and if you fail to write \u201csafe\u201d Airplane code, you might allow some users to read information that belongs to another user. Let\u2019s say in Ship, this problem doesn\u2019t exist \u2013 Ship won\u2019t let our code compile if it can detect the possibility of this bug.

\n

Even with this added on, Ship is still not theoretically safer than Airplane; a correctly written Airplane program is as safe as a Ship program that compiles. But I would say in this case, most programmers would prefer Ship. Even though Airplane is theoretically as safe as Ship, a language that provably lacks a class of errors is at least marginally better than a language that may probabilistically have that class of errors.

\n

Now the twist is, is Ship in a different class of language compared to Airplane? Does safety matter?

\n

Choose your own adventure

\n

It\u2019s up to you. Theoretically, safety doesn\u2019t play a part in a language\u2019s power. God awful safety vulnerabilities and memory leaks don\u2019t matter because, while they cause us and our users to be sad, and might even cause us to go to jail, they don\u2019t make our program any less powerful. A program doesn\u2019t care if you go to jail or not, actually, and neither does math. If you are concerned with this definition of power, then Airplane and Ship are indeed equivalent. But maybe you have a more practical bent. If so, Ship is probably the better language \u2013 hey, you can express yourself with 100x less code, and it has more safety built in than Airplane.

\n

Is Safety Preference?

\n

That leads me to my last point. Is safety preference? Is a safer language categorically different from an unsafe one? I would argue so. Most programmers these days will probably never touch a memory unsafe language (C and C++ being the most popular of these). There\u2019s a whole extra class of errors to account for in those two languages, and when you choose a garbage collected language, most of the time you don\u2019t have to worry about memory. That helps you deliver safer software. And that\u2019s great. It makes you more productive. But maybe you need performance instead of safety, and this is where most people go for performance. Safety comes at some cost, and you can\u2019t always have safety. But if you can have safety, you might as well buy into it, since it lessens your cognitive load.

\n \n\n", "date_published": "2020-04-24T16:31:51-05:00" }, { "id": "gen/performance-matters.html", "url": "gen/performance-matters.html", "title": "Performance Matters", "content_html": "\n\n \n \n \n \n \n \n\n Performance Matters\n \n \n \n \n \n \n \n
\n

Performance Matters

\n

Date: 2020-03-29T13:51:41-05:00

\n
\n

Asking programmers about performance generally leads to two trains of thought. Either it doesn\u2019t matter, and there are other metrics to chase, or performance matters. Most of the time, those who think that performance matters can be broken down into two different streams of thought. One group thinks that performance matters because it saves money \u2013 if your server costs are a million dollars a month and you make the whole system 10% more efficient, you\u2019ve reduced server costs by about a hundred thousand dollars. It\u2019s nice to save that money. The other group is more intense about it \u2013 performance is the only thing that matters because it allows us to create programs that solve problems that other (less efficient) programs would simply not allow us to. Plenty of hot-path code is written in low-level highly optimized languages.

\n

Each of the three views is sometimes true: sometimes you don\u2019t need to look at code under a microscope because the underlying runtime will optimize it, sometimes saving money may save the company, and sometimes performance is the only thing that matters. But I\u2019m going to say that performance has always mattered and always will, because hardware keeps getting smaller and we want to do more with less. It turns out that there is a great analogue for this in chemistry already.

\n

In chemistry, the nucleus of an atom is held together by the strong nuclear force, while the electromagnetic repulsion between its protons tries to push it apart. For small elements, the strong force easily wins. But this begins to break down as atoms get bigger \u2013 the strong force acts only at very short range, so it does not scale with the size of the nucleus. As such, there are radioactive elements like Uranium-235, where the repulsion overpowers the binding and makes the element unstable. Radioactive elements eventually tear themselves apart, decaying into smaller elements.

\n

Hardware has historically (and I don\u2019t think this will change much) been the same way. At first, computers like the ENIAC were the size of a house (Wikipedia says 1,800 sq ft) and consumed 150kW of electricity to perform a whopping three square-root operations per second. That was wholly insufficient for most of the work a computer needed to do, so research went into optimizing hardware to the point where it could be made smaller. That led to the micro-computer, a computer the size of a desk, which could do some interesting tasks like word processing or simple text games. Eventually we found that we had enough performance to make a computer that was portable, and that was called a laptop. Then the hardware advanced to the point where it could fit into your pocket, and that became the smartphone. Tablets branched out to sit between a laptop and a smartphone in size (and therefore compute). Our need for smaller and more efficient hardware meant we couldn\u2019t throw performance by the wayside; every time performance was \u201cgood enough\u201d, the hardware got smaller until performance mattered again.

\n

If you believe history will continue, then we\u2019ll have even smaller devices with even stricter real-time requirements. Doctors have been asking for more efficient tools to treat patients for a very long time, for good reason. More portable and cheaper tools allow them to treat more patients at less cost. More powerful tools let doctors find signs they might\u2019ve missed with less powerful ones. A new generation of video gamers wants to bring their games on the go \u2013 as such, the mobile video game industry has blossomed apart from traditional console games. VR headsets are now being used for therapy, and this is extremely performance sensitive \u2013 if the program on a VR headset lags by even milliseconds, the user may become fatigued or nauseous, which defeats the purpose of the technology.

\n

History has always been about doing more with less, and I don\u2019t see that changing in the near future. Maybe when Moore\u2019s law stops holding true?

\n \n\n", "date_published": "2020-03-29T13:51:41-05:00" }, { "id": "gen/who-gets-stuff-done.html", "url": "gen/who-gets-stuff-done.html", "title": "Who Gets Stuff Done", "content_html": "\n\n \n \n \n \n \n \n\n Who Gets Stuff Done\n \n \n \n \n \n \n \n
\n

Who Gets Stuff Done

\n

Date: 2020-03-12T11:52:45-05:00

\n
\n

Table of Contents

\n \n

Software developers are generalists. Ask the average software developer questions about networking, databases, compilers, operating systems, data structures, distributed systems, and 9 out of 10 will tell you that they know something about them. But this generalist mentality begins to break down once you acknowledge one of the most fundamental rules of life:

\n
    \n
  1. Nobody can be an expert at everything.
  2. \n
\n

And if nobody can be an expert at everything, we eventually must have roles for our field. One mechanic alone cannot build a high-performing and standards-compliant car these days. It\u2019s just too hard. Likewise, software development is too hard for any one person to keep the whole field in their head. And that\u2019s a good thing. It means that we have roles for who does what, and that helps teams move quickly, without getting bogged down in the minutiae.

\n

So we have well-defined roles for the work that we do on our teams, like Front-End Developer, Back-End Developer, DevOps, SysAdmin, Designer, Product Manager. All is dandy in the world. Well, all would be dandy if there weren\u2019t the question of what makes a developer a developer.

\n

Once a field becomes sufficiently mature, the practitioners of the craft (the real intense ones, anyway) start sharing a common idea of the ideal practitioner. At first this starts out as vague and easy-to-achieve ideals, like \u201ca real developer would be able to answer fizzbuzz in 30 seconds\u201d, or \u201ca developer should be able to understand one programming language well\u201d, which is all hunky dory. And soon a regulatory body of practitioners, filled with the people who best live up to that ideal, is born, and they create norms and disseminate them to the rest of the practicing population. Psychologists in America have the APA, doctors must pass board certification, and lawyers have to pass the bar in each state. I\u2019ll call the idea of the ideal practitioner the \u201csoul\u201d of the field, and the regulatory body (a la the APA, the bar, or other) the \u201cbody\u201d. If all is in harmony, the \u201cbody\u201d and \u201csoul\u201d are in alignment \u2013 the practitioners agree with the higher-ups on what should be taught, and on what constitutes a \u201creal\u201d practitioner.

\n

The system I\u2019ve described above works great when \u201cbody\u201d and \u201csoul\u201d are in tune \u2013 there is buy-in from both sides of the table. But programming has never had that. And it\u2019s because the field is too varied \u2013 one that should\u2019ve been split up (and probably will be) in the coming decades. You see, development doesn\u2019t fit as neatly into this model as other professions do, because at large, there aren\u2019t two groups of programmers. There are three: Academics, Systems Programmers, and Application Programmers.

\n

In computer science, academics do research on problems that are remarkably forward-thinking \u2013 as you read this, papers are being published on increasing the speed of low-level hardware, operating system calls, database reads and writes, distributed systems, programming languages, AI, statistics, quantum computing, cryptography, what have you. These all have wide impact, sometimes decades later \u2013 Barbara Liskov (of the Liskov substitution principle) was working on object-oriented languages in the 1970s, some 20 years before they made it to the mainstream for practitioners in the form of Java. Liskov, as well as Lamport, made long-lasting contributions to distributed systems research, which changed the way practitioners built their infrastructure and has allowed companies like Google and Amazon to become global companies. RSA encryption, invented in the 1970s, is widely used today. There is a long tail of important and groundbreaking research, but you get the gist \u2013 Academics do important research.

\n

Systems Programmers are the ones who, when they see a reason to, implement the academics\u2019 work. Linux implements some of the cutting edge of operating systems research. ZooKeeper implements consensus algorithms that academics envisioned decades earlier. OpenSSL implements RSA encryption, among many others. Systems Programmers transfer the abstract, theoretical world of theorems and proofs into libraries and packages for the rest of us to use.

\n

Application programmers are the ones who take the work of System Programmers and create products and applications that are used by the world at large (read: not tech-savvy people). They deal with the nitty gritty of presentation and User Experience, and find creative ways to use the tools they have to make products that require very little in-depth knowledge of the product to use.

\n

With these three camps, it is impossible to have either a \u201csoul\u201d or \u201cbody\u201d for programmers. I\u2019ll list out the ideal \u201cbody\u201d and \u201csoul\u201d for each of the three camps.

\n

Academics:

\n
    \n
  • Body:\n
      \n
    • An academic body that tests developers on the rigor of their proofs and theoretical knowledge.
    • \n
  • \n
  • Soul:\n
      \n
    • A practitioner that thinks from first principles to expand the rigor and breadth of the field.
    • \n
  • \n
  • Education:\n
      \n
    • A higher education (Masters or PhD)
    • \n
  • \n
\n

Systems Programmers:

\n
    \n
  • Body:\n
      \n
    • A coalition of workers who stress low-level (in code) fundamentals, and tool building for performance.
    • \n
  • \n
  • Soul:\n
      \n
    • A practitioner who doesn\u2019t necessarily need to produce academic research, but can pick up academic works and translate them into libraries and packages.
    • \n
  • \n
  • Education:\n
      \n
    • A degree in the field (Undergrad, Masters, or PhD)
    • \n
  • \n
\n

Application Programmers:

\n
    \n
  • Body:\n
      \n
    • A group of programmers who build products for the general population.
    • \n
  • \n
  • Soul:\n
      \n
    • A practitioner who can pick up a wide variety of tools, and knows how to use them to quickly create the required product.
    • \n
  • \n
  • Education:\n
      \n
    • Varied.
    • \n
  • \n
\n

All three groups are in constant conflict, and this leads to the chaotic state of software development \u2013 at one end, the Academics aren\u2019t even implementing software, and at the other end, Application Programmers stress a high-level knowledge of a wide variety of areas. Agreement isn\u2019t strictly necessary, but it would pay off enormously in one area: interviews.

\n

The Hiring Bar

\n

After being interviewed for days by Bell Labs, a young upstart computer scientist named Bjarne Stroustrup found a job at one of the most coveted research labs in the nation.

\n

After being interviewed for weeks by an Unnamed Big Firm, a young upstart new graduate named ${GENERIC_NAME} found a job at one of the most coveted firms in the nation.

\n

See any relation? You should, because I made it painfully obvious. Big firms subject prospective candidates to a brutal interview loop, consisting of questions on low-level operating systems, compiler theory, algorithms and data structures, distributed system design, and more. This all makes sense if you want a software developer who will work on all of these areas, but most firms do not actually need their candidates to know these things on the job \u2013 the knowledge is abstracted away from them by the work of Systems Programmers who provide (mostly) good libraries to build on. Maybe the Unnamed Big Firm really does have problems of scale \u2013 but your average start-up does not. And yet, the hiring process continues this way.

\n

The big firms might be justified for want of unicorn talent (after all, they\u2019re willing to pay for it), but most firms simply cannot afford to pay the compensation that these developers are worth these days. And yet, the hiring process continues, and I hear companies complain on LinkedIn about how hard it is to hire and retain good developer talent. To those companies, I only have a few choice words: buck the norms, and fix your interview process.

\n

I\u2019ve heard countless stories of acquaintances not passing an interview because they were asked questions outside of their specialization, even though that specialization matched the job. One acquaintance with an interest in robotics and hardware was asked to implement a Todo CRUD app. He failed. Another friend was asked about low-level disk-write system calls for a React position.

\n

I think this happens because companies have a mistaken perspective of \u201cwho gets stuff done\u201d, a.k.a. \u201c10x engineers\u201d. They assume that to be a \u201cgreat\u201d developer, you must understand everything about your computer, and that that translates into great code. That is not true. A good developer knows the appropriate level of abstraction for the task at hand. Asking a systems programmer about application programming, or vice versa, is a surefire way to destroy your hiring pipeline. The best companies know this, so they don\u2019t do it. They hire \u201c1x\u201d engineers, and make it to market \u201c10x\u201d as fast as the other firms. Those firms win. The product people love the devs because they crank out bug-free code in record time, and the customers love those companies because they make genuinely good products.

\n

Always value getting stuff done.

\n \n\n", "date_published": "2020-03-12T11:52:45-05:00" }, { "id": "gen/10-predictions.html", "url": "gen/10-predictions.html", "title": "10 Predictions", "content_html": "\n\n \n \n \n \n \n \n\n 10 Predictions\n \n \n \n \n \n \n \n
\n

10 Predictions

\n

Date: 2019-12-26T19:56:17-05:00

\n
\n

Table of Contents

\n \n

In a few more days, the 2010s will be over. Lots has changed in the programming world \u2013 Java is no longer king; JavaScript is, having been the most popular language (according to Stack Overflow) for seven years straight. NoSQL databases like MongoDB, Redis, and Cassandra have become exceedingly popular, as have front-end web technologies such as Angular, React, and Vue. Kotlin has become the preferred language for Android development, along with Swift for iOS. SaaS companies are ubiquitous, and Marc Andreessen\u2019s prediction that \u201csoftware is eating the world\u201d rings even more true today.

\n

Let\u2019s hope that the 20s are an even wilder ride for software development, and to that end, here I\u2019ve decided to compile ten predictions for the next decade as a fun little exercise. As a reader of this article, I encourage you to do the same (I\u2019m looking forward to seeing how our predictions are different or similar!)

\n

Most of these predictions will be wildly incorrect, but I think this is a good excuse to think about what could be coming in the future.

\n

1. Self-driving cars will still be two years out

\n

Before anyone asks, I\u2019m talking about Level 5 \u2013 fully autonomous, the kind where you could take a nap in the back seat with no driver. There is a lot of interest in this space, and for good reason \u2013 large companies like Uber, Lyft, Waymo, and Tesla have been researching self-driving cars for the better part of this decade. There are many technical concerns regarding self-driving cars, but I\u2019m actually fairly sure they\u2019ll be solved this decade. Legality is a huge gray area. Must self-driving cars never have an accident? If that blocks general availability, self-driving cars will never make it onto the road. But if the government allows self-driving cars as long as they have fewer accidents than human drivers, I think there\u2019s a shot they make it this decade. But I think the real problem is that regulation will lag behind.

\n

2. Rust will become a top 10 language in popularity

\n

According to the 2019 Stack Overflow developer survey, the tenth most popular language is TypeScript, with 21.2% of respondents professing usage. Right above it is C++ at 23.5%, and right below is C at 20.6%. Rust currently sits at 3.2%, just below Scala. Of those ten languages, TypeScript is the only one younger than a decade, but it has already edged out C in popularity. Rust is the only language that can save us from C and C++ supremacy in high-performance computing. It has a couple of famous backers (like Mozilla and Amazon), and it has won most loved language for four years running on Stack Overflow. Allowing access to low-level computing without unsafe abstractions is a real treat. Every generation of programmers flocks to a new way of doing computing \u2013 C saved us from assembly, and C++ followed to tack on object-oriented programming and RAII. Java popularized garbage collection, and languages such as JavaScript, Python, and Ruby brought higher-level functional abstractions to the mainstream. I think this next decade will see the rise of Swift, Kotlin, Go, and Rust into the top 10 of languages.

\n

3. WebAssembly will kill JavaScript and Desktop Apps

\n

I don\u2019t mean kill kill, like how C did away with Fortran. I expect JavaScript to still be a top-5 language by the end of this decade, but I think WASM is too much of a game changer to ignore. Applications with higher performance requirements are gated from the web because JavaScript is the only front-end programming language \u2013 you simply can\u2019t expect a garbage-collected, interpreted language to be high performance. WebAssembly changes all of that. Compile Rust, Go, C, or C++ for the front end. Games, exiled to desktops after the death of Flash, can come back to the web. Developers with high CPU requirements (like AI/ML apps) will most likely find their home on the web this decade. I expect something similar to npm popping up, but for all kinds of packages in all kinds of languages, widening the range of the web.

\n

4. JSON will be replaced with a Typed Transfer Protocol

\n

JSON has been around for 20 years \u2013 but I don\u2019t expect it to be popular for another 20. While I enjoy working with JSON APIs, I think that choosing a JavaScript-based transfer protocol is a double-edged sword. Sure, it rose in popularity because it\u2019s just like JavaScript, but JavaScript is fast and loose, something not all programmers appreciate. Tools to facilitate typing and strictness have popped up for JSON, but sometimes it\u2019s better to start from the ground up. I expect something like YAML with strict typing to become the standard by 2030.

\n

5. Functional Programming will finally become popular

\n

Programming in a functional style has become all the rage recently, but people still haven\u2019t adopted functional languages into their toolkits. Of the three tenets of functional programming, most mainstream languages have accepted two: functions as first-class citizens, and stronger typing. Immutability is hard to retrofit when all of your data structures are mutable, so that\u2019s a non-starter. I just want to be able to talk about algebraic data types at a meetup without being an outcast, darn it!

\n

6. Microservice hype will wear off

\n

This one will probably seem crazy to half of the readers, and obvious to half of the readers. Microservices are great because they encourage looser coupling. Unfortunately, looser coupling also requires more code. And the worst thing you could do to your codebase is increase the amount of code it has. If monoliths create tech debt gradually by entangling everything in their grasp, the sheer amount of code microservices introduce will blanket your entire organization.

\n

7. Facebook will no longer be a top 10 company

\n

Facebook made one good product 15 years ago. The other two products driving most of their profits, Instagram and WhatsApp, were acquisitions. Among the youth, Facebook is unhip and ancient \u2013 you know, like Commodores are artifacts of a bygone epoch. Zuckerberg is smart, but I don\u2019t think good acquisitions (which saved Facebook this decade) will save them the next. We\u2019ll see, though.

\n

8. An AI startup will become this decade\u2019s hottest startup

\n

Last decade\u2019s hottest startups, like Uber, or Lyft, or AirBnB created the gig economy. I expect AI to try to coax the gig economy into its coffin.

\n

9. Blockchain will be this decade\u2019s beanie babies

\n

Blockchain has been gaining traction as a secure new way to exchange funds, but I don\u2019t see it taking off just yet \u2013 it strikes me as an idea that has arrived a decade too early.

\n

10. You and I will host our apps on a new cloud provider (read. not AWS)

\n

AWS became extremely popular this decade, and I don\u2019t expect the service to die in the next. While it\u2019s great for enterprises, thinning out the IT department, it\u2019s not made for you and me. It\u2019s confusing, for one, with configuration hiding behind every corner, ready to jump out and spook you. Oh yeah, and it costs a lot.

\n \n\n", "date_published": "2019-12-26T19:56:17-05:00" }, { "id": "gen/software-engineering.html", "url": "gen/software-engineering.html", "title": "Why are there so many Software Engineers?", "content_html": "\n\n \n \n \n \n \n \n\n Why are there so many Software Engineers?\n \n \n \n \n \n \n \n
\n

Why are there so many Software Engineers?

\n

Date: 2019-04-14T20:56:58.198Z

\n
\n

Table of Contents

\n \n

Hardware vs.\u00a0Software

\n

Have you ever wondered why so many people are software developers instead of hardware developers these days? I certainly have. And even if you haven\u2019t, I\u2019ll go ahead and give you a cut-and-dried example of why that is, and why software is considered to be the way of the future (for now). To be fair, I\u2019ve never formally studied either hardware or software \u2013 hardware is black magic and software is a black box \u2013 so take my narrative and reasoning with a grain of salt.

\n

What is Software

\n

First off, we\u2019ll have to define what software is. I really don\u2019t have a great definition to give you, but I\u2019m sure you know what software is anyway \u2013 it\u2019s anything above hardware, and hardware is anything below software. I jest, of course. Software is defined as operating information used by a computer. These days, any kind of programming that builds on an operating system (OS) is software programming. To me, operating systems are the proverbial line drawn in the sand between hardware and software \u2013 an OS takes your instructions and etches them into the hardware. It turns software actions into hardware truth, a sort of bridge between hardware and software.

\n

What is Hardware

\n

This time I\u2019m better prepared to read out the definition of hardware! Put down your pitchforks. Ahem. Hardware is \u201cthe machines, wiring, and other physical components of a computer or other electronic system.\u201d Sound good? Hardware is all around us. I\u2019m writing this article on a Macbook Pro, I ride the bus to and from work, and I even use a microwave from time to time (a lot of the time, actually). These are all examples of hardware with software interfaces that expose an interactive API. When I click the button to reheat my pizza, the software sends instructions to the microwave\u2019s hardware (turn on, use this much electricity, start spinning, turn off, etc).

\n

Why Software

\n

Now that we have some definitions out of the way, let\u2019s take a trip down memory lane. Long, long ago there were not that many computers, and hardware was very expensive. One of the first popular programming languages was Fortran, which ran on the IBM 704 mainframe computer. Computers were the size of rooms back then, and this super computer packed a whopping punch: a jaw-dropping memory of about 18KB, and it weighed 10 tons. Whoo wee. We\u2019ve come a long way, haven\u2019t we? Anyway, given the cost of these computers and the amount of electricity they consumed (thousands of dollars worth an hour), it made sense to optimize for hardware time, not programmer time. See, a team at IBM would have access to one of these. That meant a team of smart, capable individuals all had to share one computer. They\u2019d fuss for hours over faster algorithms, because it mattered. A lot. Most people programmed in pure machine code in those days, because a compiler couldn\u2019t come close to the amount of optimization a team of super-smart people could do. Clearly, hardware was the constraining factor here, not labor. And for many years that stayed true \u2013 interpreted languages (such as Lisp) were looked over because they ran far slower than native machine code. Needless to say, most everyone programmed in machine code, and that was the way things were.

\n

But eventually, that changed. Think about Moore\u2019s law: the number of transistors in a circuit doubles roughly every two years. With the increase in computing power of hardware, programmers were freed from having to use machine code for everything. Eventually they used assembly, then higher-level languages like C, and nowadays extremely high-level languages like Python, Ruby, and JavaScript have risen to prominence. Sure, JavaScript runs orders of magnitude slower than machine code or assembly, but with every passing year, higher-level languages rise in popularity. Every year, our hardware gets better. That means every year, hardware costs less. Thus, programmers are freed from having to optimize every facet of their programs. Oddly enough, programmers are incredibly cheap to equip these days. Give them a $1500 computer, two monitors, a keyboard and mouse, a chair and desk \u2013 that\u2019s about $3000 for all the hardware they need. But the company also needs to pay their salary, which varies quite a bit, but it sure costs a lot more than $3000 a year, let me tell you.

\n

So the trend is quite apparent right now. Hardware costs less and less with each passing year, whereas software costs more and more. Software overtook hardware as the limiting factor in development, and ever since then, there\u2019s been a boom in software jobs while hardware jobs have stagnated. And it doesn\u2019t seem to be stopping anytime soon. Steve Jobs famously told Barack Obama to lower barriers to immigration because Apple just couldn\u2019t hire enough talented software engineers. Salaries have skyrocketed for software developers, and it looks like they won\u2019t be falling back to earth soon, as long as companies have far too much demand for software developers and far too little supply.

\n

Conclusion

\n

So the trend seems to be that we\u2019ll need more and more software engineers, and maybe hardware won\u2019t grow as fast. Even with the explosion in mobile devices in the past decade, there can be thousands of apps on one phone. That means there\u2019s one team of hardware engineers for every thousand or so teams of software engineers. So, perhaps, the direction from now on will be more software-oriented instead of hardware-oriented. Or, perhaps, there\u2019ll be another change. Maybe hardware will see a revival when we all become cyborgs in the near future. Who really knows?

\n \n\n", "date_published": "2019-04-14T20:56:58.198000+00:00" }, { "id": "gen/progressing-in-programming.html", "url": "gen/progressing-in-programming.html", "title": "Progressing in Programming", "content_html": "\n\n \n \n \n \n \n \n\n Progressing in Programming\n \n \n \n \n \n \n \n
\n

Progressing in Programming

\n

Date: 2019-04-11T20:41:20.201Z

\n
\n

Table of Contents

\n \n

I\u2019ve always wondered what the power of journaling was \u2013 I\u2019d never been the type to write all of my goals in a journal, and plan meticulously about what I was going to do \u2013 I took everything at face value and jumped on every opportunity as it came. But no longer! It\u2019s time to begin blogging about progress. We\u2019re going to make some gains. (Not of the gym kind, of course).

\n

Before we begin to do any growing, we have to know what to grow in! So we take this big goal and shrink it down into bite-sized pieces. My goal is to become a better software developer by the end of the year. My reasoning is that understanding more about the fundamentals of programming will help me day to day, and will show me what I need to work on next.

\n

So then, what is there to learn? Well, too much. But I\u2019d like to improve at certain parts of programming that are ubiquitous in the field.

\n

C Programming

\n

Ah, C. The language of Unix, the most influential operating system of all time. Such a powerful language \u2013 with just 32 reserved words, you too can build your own kernel in just a few thousand lines of this wildly expressive language. C is pretty much the de facto language for compilers and operating systems (most of them are built in C), and this makes it a great first language to sink my teeth into. The way I\u2019ll do it is by reading through Programming in C (4th edition), which should be coming in the mail soon. I\u2019ll be littering this blog with tidbits I learn from that book, and summarizing the key concepts here.

\n

Operating Systems

\n

You\u2019ve noticed I mentioned operating systems in the section above. And of course, operating systems are invaluable to us \u2013 no one these days interacts with a computer without one. So learning about operating systems should be a high priority, especially Unix-based ones. For that, I\u2019ve picked up a copy of OSTEP (Operating Systems: Three Easy Pieces) to help me learn about virtualization (taking the physical CPU and memory and presenting each process with its own virtual version of them), concurrency (turning this single-threaded computer into one that can run multiple processes at once), and persistence (saving our actions \u2013 like writing and editing files \u2013 to disk, so they won\u2019t be lost on the next boot). Operating systems are considered a hard topic, so I\u2019m looking forward to the challenge! Maybe as a final project, it would be cool to make a small OS that could support file IO or something like that.

\n

Compilers

\n

Next up is compilers \u2013 the thing that takes your code and turns it into machine language. I don\u2019t know too much about compilers (hey, I\u2019m a JavaScript (JS) guy, I use an interpreter), but I\u2019ve been keen on them ever since Babel entered the scene. If you don\u2019t know what Babel is, you\u2019ll have to know a bit about front-end web development. One of the pain points of front-end development is that we, as front-end developers, have to support fairly old browsers, which support different features of HTML, CSS, and JS. For example, IE8 supports HTML4, CSS 2.1 (that means no media queries), and ES3 (the version of JavaScript that was standardized in 1999\u2026). Currently, Google Chrome supports HTML5, CSS3, and ES10 (the version of JavaScript standardized in 2019). So we could all painfully write ES3-compatible JavaScript for IE8, or we could ditch browser support for IE8 (a lot of firms have) \u2013 but there\u2019s still a large user base on IE11, so it\u2019s a hard sell to drop support for that. IE11 supports HTML5, CSS3, and ES5, the 2009 standard of JavaScript. You get the idea: lots of browsers to target, but to target them all, we\u2019d have to write in some subset of JavaScript from ten years ago. Not very fun. Some years ago, a solution named Babel was released \u2013 it takes the modern JavaScript that you write and turns it into valid ES5 code. It\u2019s a compiler for JavaScript \u2013 really, it\u2019s perhaps the best open-source project in the JavaScript ecosystem. And to understand Babel, you have to understand compilers. Cue the JavaScript music (Babel actually has a theme song, a cover of Jeff Buckley\u2019s Hallelujah). I\u2019d like to write my own small language too, preferably in C, so I can understand the nitty-gritty of creating a compiler and get closer to the metal, if you will.


C++ Programming


Ah, C++. When you program in C++, every problem looks like your thumb, and every goto leads you to a black hole. I\u2019m kidding, by the way; I don\u2019t have any experience writing C++, so I can\u2019t tell you much about the language. It\u2019s very popular, with uses in pretty much all industries, but especially in game development. It\u2019s an excellent way to learn about Object Oriented Programming (something I\u2019m really lacking), and it has a wide standard library (again, something that JavaScript is sorely lacking), which handles plenty of use cases, such as std::vector<T> for dynamic arrays and std::array<T, N> for fixed-size arrays. What more could you ask for?


Java Programming


Java is the best and worst thing that\u2019s happened to programming in the past 20 years. Any takers on that statement? Steve Yegge said it. But it\u2019s interesting, you see, because when one refers to Java, it\u2019s hard to tell whether they mean the language or the Java Virtual Machine (JVM). Java has so many people who live and die by it (in the metaphoric sense); it has so many zealots who swear by it and nothing else and exalt it to memetic heights \u2013 the old-timers used to tell me about how Java saved them from environment-dependent variables and from worrying about memory management and pointers. And it\u2019s true that Java, specifically the JVM, is amazing at abstracting away all the stuff that we really shouldn\u2019t be managing anyway, but one concern is that Java is just such a slow-moving language \u2013 it was in the Sun Microsystems days, and it still kind of is in the Oracle days. Still, Java is a key language to at least know the basics of, and I admire how forward-thinking the Sun Microsystems team was in developing both the JVM and Java. Truly wonderful. A+ language indeed.


Go Programming


Golang is the best programming language to come out in the past 10 years. How many takers do I have for that statement? Go reminds me of C, but brought into the 21st century \u2013 it has garbage collection (yay!), it can be run or compiled (the best of both worlds), the compiler can infer what types you\u2019re using, so you can say x := 5 instead of the slightly more verbose var x int = 5, there are no semicolons (the compiler inserts them at the end of each line for you), and it has a very small API, with a modern standard library, so that you can do what you need to without having to know every nook and cranny of some arcane language (see: C++, Java, JavaScript). And it has a cute mascot. That\u2019s the real kicker, to be honest. Java\u2019s mascot, the Java Duke, looks like it\u2019s from a 17th-century sketchbook. Python doesn\u2019t have a mascot, and neither do JavaScript or C. Chalk a win up for the boys and girls at Google; their marketing is superb.


Python Programming


Python is one of the most loved languages these days \u2013 and I will say without a doubt that it is an amazing language. Guido really outdid himself. But one thing I don\u2019t get is why there\u2019s a fork in the language: why is there Python 2 (which this Mac runs) and Python 3 (which this blogger uses)? The syntax is a bit different from the usual C-derivative language syntax, but I guess that\u2019s fine, because Python has so much going for it: lambdas (since 1994), comprehensions (everyone\u2019s favorite feature borrowed from functional programming), and a killer ecosystem (I instinctively etch import numpy as np into almost every .py file I write). And it\u2019s a scripting language that can automate everything from emails, to file moving, to web development, to even cars. Why don\u2019t people like scripting languages more?


Data Structures and Algorithms


Data Structures and Algorithms are perhaps the most sought-after skills in interviews, which has led a lot of people to discount the process entirely. But you know, I don\u2019t blame tech firms for using data structures and algorithms questions to screen applicants, and here\u2019s why. Let\u2019s say you want to hire a back-end software engineer, and your stack is Python\u2019s Django, for example. Almost no one will have prior Django experience, so you have two choices. Either you test all applicants for Django knowledge (and it could take your team a year to fill those three vacancies), or you test on concepts familiar to all back-end developers \u2013 ask questions like \u201cwhat does MVC mean? How would you handle real-time communication in an app?\u201d, or, even easier, say you don\u2019t even have to write any code, we\u2019ll test you on ideas. And that\u2019s really the core of what the Data Structures and Algorithms technical interview does: it tests some common knowledge among all developers. So you know, it\u2019s not so bad. To get better at Data Structures and Algorithms, I\u2019ve got Elements of Programming Interviews (EPI in Python), Daily Coding Problem, and Algorithms by Skiena all lined up and ready to go. Hopefully by the end of the year I\u2019ll be a better problem solver.


Well, those are the areas I\u2019ll be covering, and maybe documenting here and there on this blog. Hopefully I\u2019ll come out of it a better software developer, and maybe I\u2019ve encouraged you to become a better developer too.

", "date_published": "2019-04-11T20:41:20.201000+00:00" } ] }