Writing a VM

2024-07-20T11:45:44-04:00

I promised a while back to learn compilers in earnest \u2013 the programs\nthat turn your high-level code into the low-level bits that your\ncomputer can execute.

I wrote up a few tree-walk interpreters \u2013 they would tokenize the\nprogramming language provided, parse it, and then execute the AST\nitself, executing the code.

Tree-walk interpreters work pretty well for the most part \u2013 they\u2019re\neasier to write, and have great introspection (each node has all the\ninfo it needs). But it didn\u2019t really scratch my itch of going all the\nway to the metal.

If I write an assembly file to return the number 10, I would\nwrite:

.globl main\nmain:\n mov $10, %rax\n ret

I can compile that output:

$ gcc test.s -o a.out

And objdump -d the generated file:

$ objdump -d a.out\n0000000000401106 <main>:\n  401106:   48 c7 c0 0a 00 00 00    mov    $0xa,%rax\n  40110d:   c3                      ret

So our instructions are encoded as\n48 c7 c0 0a 00 00 00 c3 as bytes.

mov $10 %rax corresponds to\n48 c7 c0 0a 00 00 00, and ret turns into\nc3.

So, our \u201chigh-level\u201d assembly that we wrote is turned into those\nbytes, and the computer can read those bytes and execute them.

A tree-walk interpreter stops before serializing and deserializing\ninstructions \u2013 it runs the instructions from in-memory, so there\u2019s no\nneed to dump the state to disk.

To be fair, interpreters have a way to dump the tree that they\u2019re\nexecuting to disk, like running this command:

$ clang -Xclang -ast-dump=json -fsyntax-only -Wno-visibility test.c\n# emitted JSON here.

So we\u2019d like to do that, by turning our instruction stream from those\nhigh level instructions into bytes, and then being able to read those\nbytes back and execute the program from binary.

The way I did was sketching out a VM with a set amount of\ninstructions. Each instruction would be the first byte (so I had a cap\nof 256 instructions), and then the following bytes would be the\narguments.

Next, I had to handle the arity of each instruction. If you have say\nmov $10, %rax and mov %rbx, %rax, these have\nto be encoded as two distinct instructions \u2013 one is an immediate to\nregister move, and the other is a register to register move, even though\nthey\u2019re both called mov in the written assembly.

So, I ended up with some 25 instructions, and each instruction would\nslurp up the bytes it wanted from a binary, or serialize to those bytes,\nand then voila, as long as you had a VM that could take those bytes and\nreturn them to the assembly instructions, you could write any program\nyou wanted.

An example program, which would print 10 and then exit would look\nlike this in assembly:

putreg 10 R0\nprintreg R0\nret

I could then encode that file as binary:

$ cargo r -q -- -e print.asm > out.bin

And look at its contents:

$ xxd out.bin\n00000000: 010a 0000 0900 00                        .......

And run it from the binary format.

cargo r -q -- -d out.bin\n10

So the program went all the way from the assembly itself, to bytes,\nand then was run properly by the VM.

If you\u2019d like to take a look at the VM code on github I\u2019ve linked it\nhere.

VM Code

Next step \u2013 writing a high level language that can emit the bytecode,\nso I can go all the way from a high-level language to the bare\nmetal.

10 More Predictions

2024-03-11T08:48:55-04:00

Five years ago, in 2019, I made 10 predictions for the next 10 years\n(so 2029).

Let\u2019s see how they turned out, halfway through in 2024.

I\u2019ll grade myself out of 10 points for each prediction based on how\ngood they\u2019re looking at the moment.

Here\u2019s the list:

Self-driving cars will still be two years out

This one is still true. By self-driving, I meant take a nap as the\ndriver and leave your life in the hands of your car. Two years out was a\njab at some indefinite future, which I think is also true \u2013 people are\nstill gunning for self-driving cars, but it\u2019ll most likely take longer\nthan expected. This prediction is looking good, we\u2019ll see in 2029.

Verdict: 10 points.

Rust will become a top 10 language in popularity

By 2019, Rust was the most admired language on stack overflow\u2019s\ndeveloper survey for four years. In 2024, it\u2019s won that title nine times\nin a row. No easy feat, given the amount of programming languages in\ncommon use.

Currently, Rust is the 17th most popular language according to the\nTIOBE index, and it has quite a ways to go to crack the top 10. But it\u2019s\ngot momentum for sure.

Verdict: 5 points.

WebAssembly will kill JavaScript and Desktop Apps

In 2019, webassembly was hyped as freeing us from having to write our\nfront-ends in javascript, but also bringing the browser sandbox with\nextra security to the desktop. It had a lot of hype. Five years later,\nhowever, there hasn\u2019t been much progress. Maybe something will change,\nbut this one is not looking good \u2013 few people have replaced their\nelectron usage with WASM sandboxed apps, and wasm is still struggling by\ncommittee. We\u2019ll see.

Verdict: 2 points.

JSON will be replaced with a Typed Transfer Protocol

This is the prediction I regret the most. First of all, protobuf was\nalready somewhat popular at the time, and there were other protocols\nthat were typed transfer protocols. On the other hand, it\u2019s gutsy. JSON\nhad been the interface between the frontend and backend since XHR\nrequests, so seeing it replaced with something else like it replaced XML\nwould be an interesting bet. Regardless, this one isn\u2019t looking good\neither.

Verdict: 0 points.

Functional Programming will finally become popular

By 2019, most languages had adopted functional programming features \u2013\nRust was a popular language that married idioms that allowed for\nlow-level control and functional programming in a tasteful way, Swift,\nKotlin, and Typescript have all improved mobile and web development with\ntheir adoption of functional programming features as well. But I meant\nthis prediction to be: We\u2019d all be using languages that were FP first,\nlike Elixir, Clojure, OCaml, Haskell, etc. It seems like programming\nlanguages are adopting FP features (as they always have) but the\ncommunity at large is not using FP languages for development (more than\nthey used to, at least), so this one is also an easy grade.

Verdict: 0 points.

Microservice hype will wear off

In 2019, at the hype of low interest rates, hiring became widespread\nand microservice architecture spread like wildfire. At some size of\ncompany, I think it\u2019s inevitable to have lots of services, but that\ndoesn\u2019t mean every team gets its own set of services, like microservices\nimplies. In 2024, with interest rates being closer to 4%, companies have\ncut hiring and the microservice fad has gone down in hype, with\ncompanies favoring more monolithic approaches to web development. I\u2019m\nnot sure if it\u2019ll continue on (this is like betting that interest rates\nwill stay at 4%+ until 2029) but since the hype has worn off once, I\u2019m\nassuming it could do so again.

Verdict: 8 points.

Facebook will no longer be a top 10 company

(I should\u2019ve been more specific, in my mind I was thinking top 10\nAmerican company by market cap)

In 2019, some people (myself included) were seeing Facebook as a one\ntrick pony. They had three services (Facebook, WhatsApp, Instagram), of\nwhich they bought two, that were overindexed in social media (which was\nwaning in popularity at the time), and only relied on ad revenue.\nBasically, if social media becomes less popular, then Facebook will go\ndown.

Zuckerberg understood this pretty well, so in 2021, Facebook decided\nto change its name to Meta and go all-in on VR technology. This failed\npretty miserably, with Meta\u2019s stock price tanking in response. It\u2019s back\nnow, but Meta is currently 6th in market cap for American companies and\ncould be pushed out from the top 10 by some upstarts. We\u2019ll see.

Verdict: 5 points.

An AI startup will become this decade\u2019s hottest startup

In 2019, AI was not as hyped as it was, but in 2022, when OpenAI\nreleased Chatgpt, things changed forever. In my opinion, OpenAI\ncurrently holds the title of \u201chottest startup\u201d, so I think this one was\nright, within 3 years.

Verdict: 10 points.

Blockchain will be this decade\u2019s beanie babies

Basically every blockchain outside of Bitcoin and Ethereum are\ntotally useless, just as they were in 2019. So, they haven\u2019t really\nbecome more popular or less popular in 5 years. Easy grade.

Verdict: 0 points.

You and I will host our apps on a new cloud provider (read. not\nAWS)

I still hate configuring AWS, but with Azure and GCP fumbling the\nball to try and take market share, and AWS releasing more fundamental\nservices, AWS has maintained its position as market leader. I don\u2019t see\nmyself (or a lot of other big companies) hosting their software on any\nother cloud. This one is also easy to score.

Verdict: 0 points.

How fast are computers these days

2024-01-16T08:50:36-05:00

\n\n

I\u2019ve been doing a lot of reading about distributed systems, but I\u2019ve\nnever known how fast hardware goes these days. Some common advice like\n\u201cI/O is slow\u201d or \u201csystem calls are slow\u201d might not matter that much,\ndepending on the performance requirements of the system that\u2019s using\nthem.

General performance numbers are useful for building distributed\nsystems \u2013 let\u2019s say you have a system with 1PB of data and you want to\nbackup all that data to one node, where you can write to disk at 1GB/s.\nHow long does it take to back up the whole node? That would take about\n10 days to complete (1PB / 1GB) is 1M, and 1M seconds is about 10 days.\nIf we were using HDDs from 20 years ago instead of SSDs, where writes\nmight be closer to 40MB/s, it would take closer to 250 days, which is\nreally crazy.

Armed with this information, we might say a total outage cannot\nhappen, or we carefully design the system to partition the data so\nbackups can happen in parallel, where if we have 1000 nodes, then our 10\nday backup might take 10 / 1000 days, or closer to 90 seconds.

For a twist, I decided to benchmark my phone (a Samsung S10) vs my\npersonal laptop (a Framework laptop with a 12th gen intel i5-1240p\nprocessor, 32GB of DDR4-3200 RAM, and a 2TB Western Digital SN750). The\nS10 has a value of about $200 used these days, and the laptop about\n$1000. So, we can also see how much money it would cost to build a\nsystem using these parts (although I assume building a system with NUCs\nor the like would be much more cost efficient), we can get a ballpark\nestimation of how much certain distributed systems would cost to build.\nAs well, I\u2019ll put down my guesses for how fast I thought each benchmark\nwould be for my phone and my laptop, and see how far off I was. In the\nend, we\u2019ll discuss some in the wild systems and see how feasible it\nwould be to build them.

HTTP Servers

The communication layer of a service might use HTTP. Thus, I wanted\nto benchmark a \u201chello world\u201d HTTP service to see how fast a networked\nservice could run, given it does no work.

I assumed my laptop would run about 100k req/s, with around 10\nmicrosecond latency, and my phone would do around 10k req/s, with 100\nmicrosecond latency.

Try to guess how fast each device was:

The computer had these results:

Running 5s test @ http://localhost:3000\n  256 goroutine(s) running concurrently\n975712 requests in 4.929485946s, 105.15MB read\nRequests/sec:       197933.82\nTransfer/sec:       21.33MB\nAvg Req Time:       1.293361ms\nFastest Request:    21.095\u00b5s\nSlowest Request:    67.116949ms\nNumber of Errors:   0

And the phone:

Running 5s test @ http://localhost:3000\n  32 goroutine(s) running concurrently\n107649 requests in 4.909044953s, 11.29MB read\nRequests/sec:       21928.71\nTransfer/sec:       2.30MB\nAvg Req Time:       1.459274ms\nFastest Request:    85.677\u00b5s\nSlowest Request:    48.922239ms\nNumber of Errors:   0

The laptop had about 200k req/s and the phone had about 20k req/s.\nBoth were about twice as fast as I expected.

So, if we had to create a \u201cHello world\u201d server and reach 1M req/s, it\nwould take 50 phones or 5 laptops. The 50 phones would cost $10000, and\nthe 5 laptops would cost about $5000. Computers are cheaper.

Redis

What if we put a cache in front of our servers? How fast could it\nrespond?

Since redis is single-threaded and in-memory, it should have similar\nnumbers to the HTTP server.

I would assume that it would perform the same as the HTTP servers,\nsince they\u2019d be doing about the same amount of work: ~200k req/s for the\nlaptop and ~20k req/s for the phone.

The laptop ran at ~200k for read, write, ping.

The phone ran at ~40k req/s for read, write, ping.

Pretty good \u2013 if we could hold our dataset in memory, we could\nrespond with 200k req/s.

Sqlite

We\u2019ll need a database to store our data. Let\u2019s say we use sqlite, and\nbenchmark it, using row sizes of 1000B.

I assume that we can write about 50k/second rows on the laptop and\n10k rows/second on the phone.

The laptop:

Batching 1000 writes at a time:

$ ./sqlite-bench -batch-count 1000 -batch-size 1000 -row-size 1000 -journal-mode wal -synchronous normal ./bench.db\n\nInserts:   1000000 rows\nElapsed:   7.824s\nRate:      127817.265 insert/sec\nFile size: 1026584576 bytes

Writing 1 row at a time:

$ ./sqlite-bench -batch-count 1000000 -batch-size 1 -row-size 1000 -journal-mode wal -synchronous normal ./bench.db\n\nInserts:   1000000 rows\nElapsed:   43.839s\nRate:      22810.910 insert/sec\nFile size: 1026584576 bytes

The phone:

Batching 1000 writes at a time:

$ ./sqlite-bench -batch-count 1000 -batch-size 1000 -row-size 1000 -journal-mode wal -synchronous normal ./bench.db\n\nInserts: 1000000 rows\nElapsed: 66.006s\nRate: 15150.53 insert/sec\nFile size: 1026584576 bytes

Writing 1 row at a time:

$ ./sqlite-bench -batch-count 1000000 -batch-size 1 -row-size 1000 -journal-mode wal -synchronous normal ./bench.db\n\nInserts:   1000000 rows\nElapsed:   200.473s\nRate:      4884.369 insert/sec\nFile size: 1026584576 bytes

Here we see the phone\u2019s weakness \u2013 its disk. Inserting row by row,\nthe phone only can write 5k rows/s, whereas the computer can do about\n20k rows/s, probably due to the phone having a flash disk and the\ncomputer having an SSD.

Disk

Since the sqlite bench uncovered the weakness of the phone\u2019s disk,\nlets test out some numbers for the file system:

I decided to benchmark sequential reads + writes for both and with\nfsync for the writes. Since I used fio as my tool, and it\nsupports io_uring, I gave that a shot on my laptop. Android\ndoesn\u2019t support io_uring for security reasons, so I only\nran io_uring on the computer.

I\u2019d expect about 4GB/s read and 3GB/s write on the computer (since\nthat\u2019s what the SSD is rated at) and maybe 1GB/s read and 200MB/s write\nfor the phone?

The numbers looked like this, with a blocksize of 1MB and running 8\njobs in parallel, which seemed to be the best, throughput wise:

Computer:

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

job name	p50	p90	p99	p99.99	throughput
sync sequential read	113\u03bcs	5014\u03bcs	33424\u03bcs	37487\u03bcs	3387MB/s
sync sequential write - fsync	169\u03bcs	529\u03bcs	1319\u03bcs	333448\u03bcs	2830MB/s
sync sequential write + fsync	775\u03bcs	1090\u03bcs	1369\u03bcs	2147\u03bcs	1714MB/s
sync readwrite - fsync	204\u03bcs	416\u03bcs	865\u03bcs	14222\u03bcs	Read: 2556MB/s Write: 2665MB/s
sync readwrite + fsync	416\u03bcs	832\u03bcs	2769\u03bcs	9372\u03bcs	Read: 1052MB/s Write: 1093MB/s

Phone:

With a blocksize of 256k and running 8 jobs in parallel:

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

job name	p50	p90	p99	p99.99	throughput
sync sequential read	1549\u03bcs	4490\u03bcs	4752\u03bcs	22152\u03bcs	866MB/s
sync sequential write - fsync	18482\u03bcs	39584\u03bcs	89654\u03bcs	109577\u03bcs	100MB/s
sync sequential write + fsync	510\u03bcs	865\u03bcs	1729\u03bcs	1860\u03bcs	86.3MB/s
sync readwrite - fsync	314\u03bcs	1926\u03bcs	41681\u03bcs	41681\u03bcs	Read: 77.4MB/s Write: 75.5MB/s
sync readwrite + fsync	379\u03bcs	717\u03bcs	10421\u03bcs	10421\u03bcs	Read: 38.5MB/s Write: 41.4MB/s

The phone was substantially slower than I expected.

Hashing

I was expecting about 400MB/s hashing on the laptop and about 100MB/s\nhashing on the phone for a non-cryptographic hash, and ~40MB/s for a\ncryptographic hash on a laptop, and 10MB/s on a phone.

Laptop:

sha3-256: 56.14 MiB/sec
md5: 404.15 MiB/sec
sha1: 432.36 MiB/sec
xxhash: 1827.80 MiB/sec
murmur3: 1826.07 MiB/sec
jhash: 1542.84 MiB/sec
fnv: 3720.28 MiB/sec
crc32c: 4682.73 MiB/sec

Phone:

sha3-256: 217.77 MiB/sec
md5: 391.14 MiB/sec
sha1: 334.32 MiB/sec
xxhash: 2037.18 MiB/sec
murmur3: 1328.81 MiB/sec
jhash: 1754.98 MiB/sec
fnv: 2928.83 MiB/sec
crc32c: 14255.49 MiB/sec

I could not have been more wrong.

The phone hashes very quickly compared to the laptop, maybe because\nof some hardware instructions?

Networks

The Computer can support 2.5Gbit/s ethernet, and the phone can\nsupport 2000Mb/s DL and 316Mb/s UL, which is pretty fast, even faster\nthan disk, so network probably won\u2019t be the bottleneck.

Talking about Systems

With some numbers to work with, let\u2019s talk about some famous products\nand their requirements.

Google Search

In 2006, according to Source,\nthe google search engine contained about 850TB of information.

Assuming google search needs to handle 100k requests/second, and each\nrequest would return 4KB of data, our bandwidth requirement would be\n400MB/s. That\u2019s feasible to handle on a few laptops.

Assuming that we didn\u2019t have an index at all. We would need to\nsomehow read 850TB of data per request \u2013 even with a sequential read\nspeed of ~3GB/s, each request would take 3 days of compute time to\ncomplete. Since we have to handle 100k reads a second, and each request\ntakes 3 days of compute time, in one second we would need to spend 866\nyears of compute time to serve reads. One second of requests would also\nrequire 100,000 computers * the amount of seconds in three days, or\nabout 250,000, for 2.5B computers required to serve google search. At\n$800/computer, this would be $2.5T, 1/10th the GDP of the US. Crazy.

However, if the computer only needed to search a gigabyte of data on\ndisk to fetch a result, since a laptop\u2019s SSD can read about 3GB/s\nsequentially, even having to search on disk for 100MB would take 30ms.\nEven worse would be the computer requirement \u2013 each machine would only\nbe able to handle 30 req/s, so 3,000 laptops would be required at any\ngiven time to serve all search requests.

If we can lower the seeking on disk to only 1MB, we could handle\n3,000 req/s, and only require 30 laptops \u2013 so having a fast index\nmatters a lot. It would be even better if we could search in RAM, where\nmemory throughput is closer to 25GB/s. If the entire index was in-memory\nand needed to read 1MB of data, each laptop would be able to handle\n25,000 req/s, and we\u2019d only need 4 laptops.

Amazon S3

According to Source\nAmazon S3 holds 280 trillion objects, and handles about 100 million\nreq/s. S3 sends 125 billion events per day (1.25M req/s), and S3 handles\n4 billion checksum computations per second.

Assuming that\u2019s 80M reads and 20M writes, each of 1MB, that would\ninvolve 80TB/s of reads, and 20TB/s of writes. For the disk usage alone,\nyou\u2019d need ~7,000 laptops to handle the writes, and ~27,000 laptops to\nhandle the reads every second. But assuming we use 2.5Gb internet, at\n~300MB/s, we\u2019d need 10 times that amount, or 70,000 laptops to handle\nthe writes and 270,000 laptops for the reads.

For the hashing, assuming each hash is over 1MB of data, would\nrequire hashing 4PB/s of data. Using xxhash at 2GB/s would require\n2,000,000 laptops to just hash the data. Servers can hash xxhash data at\nover 100GB/s, so I assume amazon uses those to reduce the computer\nrequirement to a modest less than 400,000.

Conclusion

While going through this I learned a lot about how fast my computers\nare \u2013 they can handle hosting lots of services. Also, computers are\nreally cheap \u2013 an 8GB RAM, 80GB SSD instance on hetzner is ~$5/month,\nand the equivalent NUC would cost ~$100 to own, and these can handle\nthousands of requests per second at the very least, more than enough for\nthe large majority of services.

Writing a Small Search Engine

2023-07-27T07:20:52-04:00

\n\n

Today we\u2019re going to write a small search engine in ~70 lines of\ncode:

$ cloc --quiet src/main.rs\n\ngithub.com/AlDanial/cloc v 1.90  T=0.00 s (271.8 files/s, 22288.4 lines/s)\n-------------------------------------------------------------------------------\nLanguage                     files          blank        comment           code\n-------------------------------------------------------------------------------\nRust                             1             12              1             69\n-------------------------------------------------------------------------------

The implementation leans on using a few dependencies from crates.io,\nso let\u2019s fetch those first.

Initialize a new rust project:

cargo new tinysearch

cargo add anyhow\ncargo add bincode\ncargo add glob\ncargo add patricia_tree --features=serde\ncargo add serde --features=derive

With that, your Cargo.toml should have this for its dependencies.

[dependencies]\nanyhow = "1.0.70"\nbincode = "1.3.3"\nglob = "0.3.1"\npatricia_tree = { version = "0.5.7", features = ["serde"] }\nserde = { version = "1.0.160", features = ["derive"] }

Anyhow is used as box<dyn Error>, so it\u2019s not\nreally necessary, but it\u2019s nice to not have to write.

Bincode is used for serialization and deserialization to disk for\ndata structures. We\u2019ll need this to store the search index on disk and\nfetch it when we do a search.

Glob is for making glob queries, like src/**/*.rs to\nfetch all rust files recursively under the src\ndirectory.

Patricia tree is a trie data structure, that has the same ADT as a\nmap or set. Since this is a trie, and not a hashmap or btree, if there\nare redundant prefixes, they are compressed on disk. Since we\u2019re\nserializing and deserializing text, which tends to be somewhat\nredundant, this potentially saves a lot of memory compared to using a\nhash-based or normal tree-based structure.

Serde is used to serialize the tree onto disk and deserialize it, for\nbincode.

Index Implementation

Given the explanation above, your mental model to populate the index\nshould be something like this:

Initialize Trie
Fill Trie with some data (how?)
Save Trie to disk

And to read from it:

Read the file containing Trie to disk
Deserialize it back to a Trie
Query the trie

The open question is what data we fill the Trie with. Obviously it\nwould be something from our dataset, but if we make a mapping from each\nword to its document name, then this works, but we can only query one\nword at a time.

With this index, a query like \u201cice cream\u201d would only be able to query\nfor either \u201cice\u201d or \u201ccream\u201d.

To keep it simple, we\u2019ll choose an indexing strategy referred to as\nn-grams. N-grams indexes windows of length n.\u00a0Our previous strategy of\nindexing every word could be considered a 1-gram. We\u2019ll go with 5, since\nthat allows us to query up to 5 words. I queried my search history and\nfound that 5 word queries just about covers my querying needs.

Let\u2019s get coding:

in src/main.rs, add the imports required:

use std::{\n    collections::{BTreeSet, HashSet},\n    fs::{self, File},\n    io::{BufRead, BufReader},\n};\n\nuse patricia_tree::PatriciaMap;\n\nuse anyhow::Result;\nuse glob::glob;

Next, let\u2019s create the cli flags with the two commands we\u2019ll support,\nsearch and index.

fn main() -> Result<()> {\n    use std::env::args;\n    match args().len() {\n        0 => eprintln!("Please provide a top level command of search or index"),\n        _ => {\n            let arguments: Vec<_> = args().collect();\n            if args().len() > 2 && arguments[1] == "index" {\n                let _ = index(&arguments[2]);\n            } else if args().len() > 2 && arguments[1] == "search" {\n                let _ = search(&arguments[2..]);\n            } else {\n                eprintln!("Please provide a top level command of search or index");\n            }\n        }\n    }\n    Ok(())\n}

Next, the function definition and matching patterns based on it.

This code matches a regex provided (like \u201c**/*.rs\u201d), and then returns\nall matching files. We then iterate through the files and process each\nline.

fn index(pattern: &str) -> Result<()> {\n    let mut mytrie = PatriciaMap::default();\n\n    for entry in glob(pattern)? {\n        let path = entry?;\n        let path = path.to_string_lossy().to_string();\n\n        let f = File::open(&*path)?;\n        let f = BufReader::new(f);\n\n        for line in f.lines() {\n            // process each line, the next step\n        }\n    }\n}

Next, we want to do something for each line. First, split each line\non whitespace to get a list of words. A production worthy search engine\nwould do stemming and remove unnecessary punctuation here, but we won\u2019t\nworry about that. Finally, we\u2019ll use the windows\nfunction on slices, to return arrays with the length provided (5) across\nthe entire list. This is n-grams in a nutshell.

let line = line?;\nlet s: String = line.to_string();\nlet words: Vec<String> = s.split_whitespace().map(|s| s.to_owned()).collect();\nlet word_windows: Vec<String> = words.windows(5).map(|w| w.join(" ")).collect();

With that, the next thing to do is to put each five gram as a key to\nthe trie, with a value of the path of the document. If that fivegram\nalready exists, we just add the new path to the list of paths it\nmatches.

for fivegram in &word_windows {\n    if !mytrie.contains_key(fivegram) {\n        let mut set = BTreeSet::default();\n        set.insert(path.clone());\n        mytrie.insert(fivegram, set);\n    } else {\n        // we know that the trie contains fivegram, so unwrapping is safe.\n        let mut set: BTreeSet<String> = mytrie.get(fivegram).unwrap().to_owned();\n        set.insert(path.clone());\n        mytrie.insert(fivegram, set);\n    }\n}

Finally, we serialize it and write it to a file:

let encoded = bincode::serialize(&mytrie)?;\nfs::write("./data.index", encoded)?;\nOk(())

Searching our Index

Fortunately, searching is easier. We take our list of search words,\nand then index into the map. Then, we grab the matches, and we append\nthem to all matching paths. At the end, we print them out.

fn search(search_words: &[String]) -> Result<()> {\n    let needle: Vec<u8> = {\n        let joined = search_words.join(" ");\n        joined.bytes().collect()\n    };\n\n    let index_file = File::open("./data.index")?;\n    let decoded: PatriciaMap<BTreeSet<String>> = bincode::deserialize_from(index_file)?;\n\n    let matches: Vec<_> = decoded.iter_prefix(&needle).collect();\n\n    let mut paths: HashSet<_> = HashSet::default();\n    for (_key, val) in &matches {\n        paths.extend(val.iter());\n    }\n\n    println!("{:#?}", paths);\n    Ok(())\n}

And with that, we\u2019re done. A simple search engine.

Giving it a test run

I indexed my notes:

cargo r -q -- index "data/**/*.md"

And then searched for mentions of distributed systems:

cargo r -q -- search distributed system\n{\n    "data/books/system-design-interview-an-insiders-guide-volume-2/distributed-message-queue.md",\n    "data/books/designing-data-intensive-applications/distributed-systems-trouble.md",\n}

Not so bad for 70 lines of code.

Offline Rust

2023-05-23T18:29:34-04:00

Rust, and its respective package manager, Cargo, easily allow for\nprogramming on the go. You can use sccache to cache\ndependencies you\u2019ve already downloaded, so if you\u2019ve already downloaded\nthem, there\u2019s no need to redownload them for any other project you use\nthe same dependency for. Sadly, that stops short when you want to\ndownload a new dependency you\u2019ve never downloaded before. If you\u2019re\noffline and don\u2019t have that exact dependency cached, you\u2019re SOL. But we\ncan do better, can\u2019t we?

You probably already know that crates.io is a thin layer\nover git. So, you can grep all the crates on it, download their source\ncode, and host your own local repository that acts as a stand-in for\ncrates.io. Luckily (if you have ~150GB to spare), there\u2019s already a\nproject that has done that for us: panamax. After you\ninstall it with cargo (b)install panamax, and initialize\nsome directory. I initialized one at ~/crates with\npanamax init ~/crates.

You\u2019ll want to set your config up: I didn\u2019t want to clone any\ntoolchains since I didn\u2019t need them, so I only cloned down the crates: I\nedited my mirror.toml inside the initialized repository to\nbe the following:

[mirror]\nretries = 5\n[rustup]\nsync = false\n[crates]\nsync = true\ndownload_threads = 64\nsource = "https://crates.io/api/v1/crates"\nsource_index = "https://github.com/rust-lang/crates.io-index"\nbase_url = "http://localhost:27428/crates"

And then I ran panamax sync ~/crates and waited for a\nday for the internet to download all the crates.

Once that\u2019s done, we have all the crates we need. Add these lines to\nyour cargo config (normally at ~/.cargo/config.toml) to use\nthe new repository:

[source.panamax-sparse]\nregistry = "sparse+http://localhost:27428/index/"\n\n[source.crates-io]\nreplace-with = "panamax-sparse"

Start up the server with\npanamax serve ~/crates --port=27428 and you\u2019re ready to\ncode offline in rust.

For maintenance, I sync crates with\npanamax sync ~/crates once a week with systemd\nand that works for me.

Writing Your own Libc in Rust

2023-05-22T08:58:02-04:00

\n\n

A few posts ago, we wrote our own libc in C. There was some inline\nassembly required to call functions. Lots of languages can call\nassembly, but since I mainly use rust, I decided to rewrite most of it\nin rust, since there are some nice advantages.

First: cfg definitions are a lot easier to remember. I\ncan never remember if the #ifdef for linux is\n__LINUX__ or linux or __linux__,\nor the #ifdefs for other platforms. Apple\u2019s is also odd\n(__APPLE__), and there are other #ifdefs for\ntargets: TARGET_OS_IPHONE\nTARGET_IPHONE_SIMULATOR TARGET_OS_MAC, and\nwindows with _WIN_32 and _WIN_64. With\nandroid, there\u2019s __ANDROID__ and\n__ANDROID_API__ as well. Getting tired? Well, there\u2019s also\nall the architecture related ones, which there are hundreds of, and\nthey\u2019re slightly different per compiler, so you have to know which\ncompiler you\u2019re using to even define macros.

There are 3 main wrappers around cfgs which are easy to wrap your\nhead around. not, which is anything that doesn\u2019t fit inside\nthe definition, like #[cfg(not(target_arch = \"x86_64\")))]\nmeans anything that isn\u2019t x86_64. There\u2019s any,\nwhich means for anything that matches an item in the list, like\n#[cfg(any(target_arch = \"x86_64\", target_arch = \"i686\"))]\nfor x86_64 or i686. There\u2019s all,\nwhich means that all items must match, like\n#[cfg(all(target_arch = \"aarch64\", target_os = \"linux\"))]\nmeans to only run on aarch64 linux.

Second: the inline asm syntax is much better. You have\nthree choices: global_asm!, which lets you write code\nanywhere, like if you\u2019d like to embed a string into your binary\u2019s text\nsection, asm!, which goes in the code section, and\nllvm_asm!, which is for llvm specific asm. You don\u2019t have\nto specify clobbers on x86_64, so the relevant x86_64\nsyscall code from C:

asm volatile (                                                    \\\n    "syscall\\n"                                                   \\\n    : "0"(_num)                                                   \\\n    : "rcx", "r11", "memory", "cc"                                \\\n);

In rust would look like this:

asm!(\n "syscall",\n in("rax") $nr\n);

Anyway, let\u2019s get started.

Implementation

Create a new rust binary, and call it whatever you like. I called\nmine syscalls.

cargo new syscalls\ncd syscalls

Open up src/main.rs and start off with importing the\nstandard assembly library.

use std::arch::asm;

Next, since we\u2019ll be supporting linux, with\nx86_64 Also known as amd64. Go\ncalls it this due to amd coming up with it, whereas intel popularized\nit, calling it x86_64.
\n
\n and aarch64 Also known as ARM64. Apple uses\nARM64 whereas others use aarch64.
\n
\n, we can force every other architecture/OS mix to have a\ncompiler error, so no one will miscompile and have a runtime error.

#[cfg(not(all(\n    target_os = "linux",\n    any(target_arch = "x86_64", target_arch = "aarch64")\n)))]\ncompile_error!("Only works on linux on aarch64 or x86_64");

This is really helpful \u2013 no more running a library and wondering what\nwent wrong at runtime.

So, lets start off with the skeleton of the first syscall function,\nsyscall0. We\u2019ll generate a function with the name of the\nsyscall, and the syscall\u2019s number. We\u2019ll make a compiler error to start\noff with since we haven\u2019t implemented anything yet.

macro_rules! syscall0 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name() {\n            unsafe {\n                compile_error!("not implemented");\n            }\n        }\n    };\n}

ARM64

So we\u2019ll start implementing syscalls in ARM64 first.

Let\u2019s look at an example of Hello World in ARM64 to familiarize\nourselves with syscalls in ARM64:

Taken from https://peterdn.com/post/2020/08/22/hello-world-in-arm64-assembly/

.data\n\n/* Data segment: define our message string and calculate its length. */\nmsg:\n    .ascii        "Hello, ARM64!\\n"\nlen = . - msg\n\n.text\n\n/* Our application's entry point. */\n.globl _start\n_start:\n    /* syscall write(int fd, const void *buf, size_t count) */\n    mov     x0, #1      /* fd := STDOUT_FILENO */\n    ldr     x1, =msg    /* buf := msg */\n    ldr     x2, =len    /* count := len */\n    mov     w8, #64     /* write is syscall #64 */\n    svc     #0          /* invoke syscall */\n\n    /* syscall exit(int status) */\n    mov     x0, #0      /* status := 0 */\n    mov     w8, #93     /* exit is syscall #93 */\n    svc     #0          /* invoke syscall */

To write, we first set x0 to the number 1\n(#1), to set our fd to stdout. Then, we move\nthe message to x1, which is write\u2019s second argument, Then\nwe move the len to x2, which is write\u2019s third argument,\nThen we move the number 64 to w8, which is the syscall\nnumber, And then we invoke the syscall with svc and the\nnumber 0.

We do something similar for exit, just without moving any arguments\nto x1 or x2.

Let\u2019s do that for our first syscall:

#[cfg(target_arch = "aarch64")]\nasm!(\n "mov x0, #0",\n "svc #0",\n in("w8") $nr\n);

with in(\"w8\") $nr, we can pass in our system call\nnumber, represented by $nr, and rust will put it into\nw8 for us. This is equivalent to mov w8 =$nr,\nbut we don\u2019t have to remember that syntax, as the rust compiler will\ngenerate it for us.

As well, we\u2019ll set the compiler errors for all architectures that\naren\u2019t aarch64 for now.

We repeat the following for the next 6 system calls, with\nx0-x5 being used as registers to pass in arguments.

macro_rules! syscall1 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\nmacro_rules! syscall2 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall3 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>, arg3: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                    in("x2") arg3.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall4 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>, arg3: impl Into<usize>, arg4: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                    in("x2") arg3.into(),\n                    in("x3") arg4.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall5 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>, arg3: impl Into<usize>, arg4: impl Into<usize>, arg5: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                    in("x2") arg3.into(),\n                    in("x3") arg4.into(),\n                    in("x4") arg5.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall6 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>, arg3: impl Into<usize>, arg4: impl Into<usize>, arg5: impl Into<usize>, arg6: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                    in("x2") arg3.into(),\n                    in("x3") arg4.into(),\n                    in("x4") arg5.into(),\n                    in("x5") arg6.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}

Note that the functions take impl Into<usize>, and\nthen the args are converted in the body of the function. That means that\nthe caller doesn\u2019t have to as usize or\ntry_into().unwrap() if they don\u2019t pass in a\nusize, which is nice, as long as the argument is\nconvertable to a usize.

Finally, we\u2019re ready to implement some system calls in ARM64!

exit takes 0 arguments and has a syscall number of 93,\nso we use syscall0! thusly:

#[cfg(target_arch = "aarch64")]\nsyscall0!(exit, 93);

And write takes 3 arguments, the fd, a string, and a\nlength, and it has a syscall number of 64, so we pass it in:

#[cfg(target_arch = "aarch64")]\nsyscall3!(write, 64);

And finally, we can write hello world:

fn main() {\n    #[cfg(target_arch = "aarch64")]\n    let string = "Hello ARM64\\n";\n\n    let ptr = string.as_ptr() as usize;\n    let len = string.len();\n    write(1usize, ptr, len);\n    exit();\n}

cargo run your file to see Hello ARM64 in\nall its glory flash onto the screen.

Now we\u2019re not done yet \u2013 let\u2019s do the same for x86_64!

x86_64

So for x86, rax takes in the system call\nnumber, and then the registers are the following: rdi,\nrsi, rdx, r10, r9,\nr8.

So now we add in those to our syscall macros:

macro_rules! syscall0 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name() {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr\n                );\n                #[cfg(target_arch = "x86_64")]\n                asm!(\n                    "syscall",\n                    in("rax") $nr\n                );\n                #[cfg(not(any(target_arch = "aarch64", target_arch = "x86_64")))]\n                compile_error!("not implemented");\n            }\n        }\n    };\n}\n\nmacro_rules! syscall1 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                );\n                #[cfg(target_arch = "x86_64")]\n                asm!(\n                    "syscall",\n                    in("rax") $nr,\n                    in("rdi") arg1.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64", target_arch = "x86_64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall2 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                );\n                #[cfg(target_arch = "x86_64")]\n                asm!(\n                    "syscall",\n                    in("rax") $nr,\n                    in("rdi") arg1.into(),\n                    in("rsi") arg2.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64", target_arch = "x86_64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall3 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>, arg3: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                    in("x2") arg3.into(),\n                );\n                #[cfg(target_arch = "x86_64")]\n                asm!(\n                    "syscall",\n                    in("rax") $nr,\n                    in("rdi") arg1.into(),\n                    in("rsi") arg2.into(),\n                    in("rdx") arg3.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64", target_arch = "x86_64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall4 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>, arg3: impl Into<usize>, arg4: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                    in("x2") arg3.into(),\n                    in("x3") arg4.into(),\n                );\n                #[cfg(target_arch = "x86_64")]\n                asm!(\n                    "syscall",\n                    in("rax") $nr,\n                    in("rdi") arg1.into(),\n                    in("rsi") arg2.into(),\n                    in("rdx") arg3.into(),\n                    in("r10") arg4.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64", target_arch = "x86_64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall5 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>, arg3: impl Into<usize>, arg4: impl Into<usize>, arg5: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                    in("x2") arg3.into(),\n                    in("x3") arg4.into(),\n                    in("x4") arg5.into(),\n                );\n                #[cfg(target_arch = "x86_64")]\n                asm!(\n                    "syscall",\n                    in("rax") $nr,\n                    in("rdi") arg1.into(),\n                    in("rsi") arg2.into(),\n                    in("rdx") arg3.into(),\n                    in("r10") arg4.into(),\n                    in("r9") arg5.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64", target_arch = "x86_64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}\n\nmacro_rules! syscall6 {\n    ($name:ident, $nr:expr) => {\n        extern "C" fn $name(arg1: impl Into<usize>, arg2: impl Into<usize>, arg3: impl Into<usize>, arg4: impl Into<usize>, arg5: impl Into<usize>, arg6: impl Into<usize>) {\n            unsafe {\n                #[cfg(target_arch = "aarch64")]\n                asm!(\n                    "mov x0, #0",\n                    "svc #0",\n                    in("w8") $nr,\n                    in("x0") arg1.into(),\n                    in("x1") arg2.into(),\n                    in("x2") arg3.into(),\n                    in("x3") arg4.into(),\n                    in("x4") arg5.into(),\n                    in("x5") arg6.into(),\n                );\n                #[cfg(target_arch = "x86_64")]\n                asm!(\n                    "syscall",\n                    in("rax") $nr,\n                    in("rdi") arg1.into(),\n                    in("rsi") arg2.into(),\n                    in("rdx") arg3.into(),\n                    in("r10") arg4.into(),\n                    in("r9") arg5.into(),\n                    in("r8") arg6.into(),\n                );\n                #[cfg(not(any(target_arch = "aarch64", target_arch = "x86_64")))]\n                compile_error!("not implemented");\n            }\n        }\n    }\n}

Finally, we\u2019ll implement the system calls:

#[cfg(target_arch = "x86_64")]\nsyscall0!(exit, 60);\n#[cfg(target_arch = "x86_64")]\nsyscall3!(write, 1);\n\nfn main() {\n    #[cfg(target_arch = "x86_64")]\n    let string = "Hello x86\\n";\n\n    // the same as before\n}

And this time, if run on an x86_64 linux machine, you\nshould see the following when running: Hello x86.

Conclusions

That wasn\u2019t so bad, just like the last blog post \u2013 but it was also\nmuch easier, and you wouldn\u2019t have to remember the magic defines that\nare compiler dependent. As well, cfg attributes are\nextremely powerful \u2013 much better than C defines because they\u2019re caught\nfor you at compile time, and there are a bunch of useful ones already\npredefined.

Coding on Android

2023-05-15T22:33:07-04:00

I like to tinker. No surprises there. But when my tinkering made it\nso my laptop wouldn\u2019t boot, I realized I\u2019d need a plan B computer for\nwhen my laptop was out of business. The solution? My android phone.

I have a Samsung S10 phone, which is about 4 years old at this point\nand gets by just fine \u2013 it has 8GB of RAM, with 8 cores, 2.84GHz\nprocessor, and 128GB of disk space. The computer I used growing up had a\n200MHz processor, with 64MB of RAM, and a 2GB hard drive. Computers are\nreally fast these days, and I was wondering how fast my phone would be\nas a general computing device.

Thankfully, the people at F-droid and Termux made that easy \u2013 F-droid\nis an alternative app store for android, which comes with lots of\ngoodies, including Termux, a unix terminal for your android that doesn\u2019t\nrequire root. I installed both and was off to the races. Termux allows\nyou to create a symlink to your normal files, so you can still fetch\nthem from the cli. I also used tailscale to quickly move files between\nmy computer and my phone, so I could get my ssh keys, and other config\nfiles quickly onto the device.

I then got myself a c compiler and rust compiler and started writing\ncode. My target triplet for rust is aarch64-linux-android,\nwhich is in tier-2 support for rust, so there would be some rough edges.\nThere certainly were.

I tried to compile some command line utilities I had wrote for\nmyself, like host, dig, strace,\nand others. The DNS library that the network requests relied upon had a\nbug that made them not compile on android. As well,\ncargo binstall, a utility which downloads binaries of rust\ncode for you, would crash at runtime due to a DNS misconfiguration. I\nhad to recompile the library with a feature flag turned off. But\nignoring all those papercuts, it was mindblowing that a 4 year old phone\nthat costs ~$200 these days could be \u201cgood enough\u201d for writing code, and\nyou could get by with a $50 or less phone for coding and still have a\ngreat experience (you\u2019d also need a keyboard and a mouse and a USB-2 to\nUSB-3 converter so you\u2019re not stuck on the default keypad, but that\u2019s\nnot expensive, the local mart had a pair for $10).

The folks at termux do a really good job packaging code \u2013 even though\nTermux Android doesn\u2019t follow the unix file specification (there is no\nroot dir), they repackage debian packages to make them work\nanyway. And my neovim config worked flawlessly on the phone. It was a\nsurreal experience \u2013 and one differentiator to apple, who locks down\ntheir devices a lot more.

I even tried briefly hosting my own website and other services on the\ninternet from my phone by using tailscale. That worked flawlessly, and\nI\u2019m sure my phone could serve thousands of concurrent visitors.

The experience made me think of two things: Do I really need a laptop\nthese days? I\u2019ll keep mine around for tinkering, but a phone is just\nabout good enough \u2013 with a USB-C monitor that can accept a keyboard and\nmouse, you could have a desktop setup for your phone, even with just one\nUSB-C port. And a phone just fits in your pocket, with a great battery\nlife, so you can take it on the go.

In sum, coding on my phone was a delightful experience. In fact, I\nliked it so much, I decided to write this post on my phone and build my\nblog on my phone and serve it from there. Given the backdrop of tech\ndoom these days, it\u2019s hard to find any tech news to feel happy about \u2013\nbut Termux + Android is so good that it\u2019s mindblowing. If anything, I\u2019m\nglad that computing has gotten so cheap \u2013 I\u2019m sure there\u2019s lots of young\npeople learning how to code on hardware just like this, and that\u2019s\nsomething that brings a smile to my face.

Writing Your own Libc

2023-02-16T16:05:35-05:00

\n\n

Note: This code was taken from Linux\u2019s nolibc: https://github.com/torvalds/linux/tree/master/tools/include/nolibc.\nCheck it out to learn more about implementing libc!

What are System Calls?

Let\u2019s write some libc functions.

Libc is C\u2019s standard library, which implements a group of functions\nthat can be used by all C programs. Libc provides wrappers for OS level\nconstructs, like printf, open,\nputs, and so on.

There\u2019s two parts to libc:

functions like max, or islower, which\ndon\u2019t require any system calls.
functions that do require system calls, like write,\nread, or open.

Functions in the first category can be implemented without calling\ninto the OS.

For example, max would look like this:

int max(int a, int b) {\n  return a > b ? a : b;\n}

or islower:

int islower(int c) {\n  return (c >= 'a' && c <= 'z') ? 1 : 0;\n}

However, when writing our own write or read\nor open, we hit a roadblock:

int open(const char* path, int flags, ...) {\n  // how do I open a file???\n}

Open is a system call that needs to manipulate hardware; we need to\nask the OS to do the action for us before being able to read and/or\nwrite to the file.

The OS supports an interface to facilitate that, called\nsystem calls. These system calls allow us to request the OS\nto do something on our behalf. These calls normally manipuluate hardware\nin some fashion, or have to do with processes.

So open might look like this:

int open(const char* path, int flags, ...) {\n  return system_call(SYSCALL_OPEN, path, flags, ...);\n}

And we defer to the OS, which takes care of everything for us.

That leaves the question: what should our system_call\nfunction look like? And what is SYSCALL_OPEN?

System Call Numbers

System calls take as their first argument a number, which indicates\nwhat system call the OS should execute. The OS then reads the first\nargument from the system_call function, looksup which\nsystem call it corresponds to, and the remaining arguments, and executes\nthat system call.

Let\u2019s say that we call open, which is the number 2:

#define SYSCALL_OPEN 2\nsystem_call(SYSCALL_OPEN, path, flags, ...);

The OS will then take that number and run the desired code.

#define SYSCALL_OPEN 2\n\nint system_call(SYSCALL syscall, ...) {\n  switch (syscall) {\n    case SYSCALL_OPEN:\n      // run code to open up a file in the hardware\n      break;\n    default:\n      break;\n  }\n}

We now need a correct list of system calls. Imagine if we thought\nSYSCALL_OPEN was 3, but the OS thought\n3 means close:

The computer could crash, our process could crash, anything could\nhappen.

#define SYSCALL_OPEN 3\nsystem_call(SYSCALL_OPEN, path, flags, ...);

#define SYSCALL_OPEN  2\n#define SYSCALL_CLOSE 3\n\nint system_call(SYSCALL syscall, ...) {\n  switch (syscall) {\n    case SYSCALL_OPEN:\n      // run code to open up a file in the hardware\n      break;\n    case SYSCALL_CLOSE:\n      // run code to close  a file\n      // oops, we called the wrong function\n      break;\n    default:\n      break;\n  }\n}

So we need to get that list of system calls:

You can find the system calls in your linux system with\nausyscall:

$ ausyscall --dump\n\nUsing x86_64 syscall table:\n0       read\n1       write\n2       open\n3       close\n...

This differs per architecture:

for example, on aarch64 (ARM 64 bit):

$ ausyscall aarch64 --dump\nUsing aarch64 syscall table:\n0       io_setup\n1       io_destroy\n2       io_submit\n3       io_cancel

We could define these ourselves, or rely on the system to export them\nat asm/unistd.h. I\u2019m going to include it instead of\nrewriting it.

#include <asm/unistd.h>

Writing system calls (with assembly)

Now that we have the system call number we need, let\u2019s write that\nsyscall!

We need to know a few things:

What the system call function is called, so we can tell the OS we\u2019re\nrunning a system call
Where to put the system call number
Where to put any other arguments required (open needs a\nfile to open, for example)
Where our return values are placed.

What better way to learn than through the manual pages: run\nman 2 syscall in the shell to learn more:

First, there\u2019s a table that tells us what a system call is called in\nthe instruction column. For x86-64, it is called\nsyscall. Next, the table tells us the register to put the\nsystem call number. In x86-64, it is rax. Finally, the\nregisters to check for return values, which in x86-64 are\nrax and rdx, and the register to check for\nerrors (in x86-64, no registers store errors after a system call).

Arch/ABI    Instruction           System  Ret  Ret  Error    Notes\n                                  call #  val  val2\n\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\nalpha       callsys               v0      v0   a4   a3       1, 6\narc         trap0                 r8      r0   -    -\narm/OABI    swi NR                -       r0   -    -        2\narm/EABI    swi 0x0               r7      r0   r1   -\narm64       svc #0                w8      x0   x1   -\nblackfin    excpt 0x0             P0      R0   -    -\ni386        int $0x80             eax     eax  edx  -\nia64        break 0x100000        r15     r8   r9   r10      1, 6\nm68k        trap #0               d0      d0   -    -\nmicroblaze  brki r14,8            r12     r3   -    -\nmips        syscall               v0      v0   v1   a3       1, 6\nnios2       trap                  r2      r2   -    r7\nparisc      ble 0x100(%sr2, %r0)  r20     r28  -    -\npowerpc     sc                    r0      r3   -    r0       1\npowerpc64   sc                    r0      r3   -    cr0.SO   1\nriscv       ecall                 a7      a0   a1   -\ns390        svc 0                 r1      r2   r3   -        3\ns390x       svc 0                 r1      r2   r3   -        3\nsuperh      trapa #31             r3      r0   r1   -        4, 6\nsparc/32    t 0x10                g1      o0   o1   psr/csr  1, 6\nsparc/64    t 0x6d                g1      o0   o1   psr/csr  1, 6\ntile        swint1                R10     R00  -    R01      1\nx86-64      syscall               rax     rax  rdx  -        5\nx32         syscall               rax     rax  rdx  -        5\nxtensa      syscall               a2      a2   -    -

Later down the page, there\u2019s another table that shows where arguments\ngo.

For x86-64, rdi rsi rdx\nr10 r8 r9 are the registers to\nput arguments in order, with rax being the system call\nnumber.

An interesting thing to note: mips/o32 here only\nsupports 4 arguments in registers. That doesn\u2019t necessarily mean it only\nsupports system calls with 4 or less arguments \u2013 arguments 5 through 8\nare placed on the stack and read when the system call instruction is\nexecuted.

Arch/ABI      arg1  arg2  arg3  arg4  arg5  arg6  arg7  Notes\n\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\nalpha         a0    a1    a2    a3    a4    a5    -\narc           r0    r1    r2    r3    r4    r5    -\narm/OABI      r0    r1    r2    r3    r4    r5    r6\narm/EABI      r0    r1    r2    r3    r4    r5    r6\narm64         x0    x1    x2    x3    x4    x5    -\nblackfin      R0    R1    R2    R3    R4    R5    -\ni386          ebx   ecx   edx   esi   edi   ebp   -\nia64          out0  out1  out2  out3  out4  out5  -\nm68k          d1    d2    d3    d4    d5    a0    -\nmicroblaze    r5    r6    r7    r8    r9    r10   -\nmips/o32      a0    a1    a2    a3    -     -     -     1\nmips/n32,64   a0    a1    a2    a3    a4    a5    -\nnios2         r4    r5    r6    r7    r8    r9    -\nparisc        r26   r25   r24   r23   r22   r21   -\npowerpc       r3    r4    r5    r6    r7    r8    r9\npowerpc64     r3    r4    r5    r6    r7    r8    -\nriscv         a0    a1    a2    a3    a4    a5    -\ns390          r2    r3    r4    r5    r6    r7    -\ns390x         r2    r3    r4    r5    r6    r7    -\nsuperh        r4    r5    r6    r7    r0    r1    r2\nsparc/32      o0    o1    o2    o3    o4    o5    -\nsparc/64      o0    o1    o2    o3    o4    o5    -\ntile          R00   R01   R02   R03   R04   R05   -\nx86-64        rdi   rsi   rdx   r10   r8    r9    -\nx32           rdi   rsi   rdx   r10   r8    r9    -\nxtensa        a6    a3    a4    a5    a8    a9    -

With all that information out of the way, we want to write a function\nthat does the following:

Write out the system call instruction in assembly\n(syscall for x86).
Set the rax register to the system call we want to\ncall.
Read the register has the return value and return it.

On x86, the registers rcx, r11,\ncc, and memory are clobbered by the syscall,\nso our assembly call must include them in the last line of the syscall\ninstruction. The last line notes the clobbered registers that are\noverwritten by the OS, as well as other directives.

For an explanation why rcx and r11 are\nclobbered.

rcx

\n
rcx is clobbered to store the address of the next\ninstruction to return to.
\n

r11

\n
r11 is clobbered to store the value of the rflags\nregister.
\n

And for cc and memory, from: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Extended-Asm

cc

\n
The cc clobber indicates that the assembler code\nmodifies the flags register. On some machines, GCC represents the\ncondition codes as a specific hardware register; \u201ccc\u201d serves to name\nthis register. On other machines, condition code handling is different,\nand specifying \u201ccc\u201d has no effect. But it is valid no matter what the\ntarget.
\n

memory

\n
The memory clobber tells the compiler that the assembly\ncode performs memory reads or writes to items other than those listed in\nthe input and output operands (for example, accessing the memory pointed\nto by one of the input parameters). To ensure memory contains correct\nvalues, GCC may need to flush specific register values to memory before\nexecuting the asm. Further, the compiler does not assume that any values\nread from memory before an asm remain unchanged after that asm; it\nreloads them as needed. Using the \u201cmemory\u201d clobber effectively forms a\nread/write memory barrier for the compiler.
\n

\n
Note that this clobber does not prevent the processor from doing\nspeculative reads past the asm statement. To prevent that, you need\nprocessor-specific fence instructions.
\n

So on x86, a system call will look like:

#define syscall0(num)                     \\\n({                                        \\\n    long _ret;                              \\\n    register long _num  asm("rax") = (num); \\\n                                            \\\n    asm volatile (                          \\\n        "syscall\\n"                           \\\n        : "=a"(_ret)                          \\\n        : "0"(_num)                           \\\n        : "rcx", "r11", "memory", "cc"        \\\n    );                                      \\\n    _ret;                                   \\\n})

For the Arm64 (aarch64) version:

#define syscall0(num)                    \\\n({                                       \\\n    register long _num  asm("x8") = (num); \\\n    register long _arg1 asm("x0");         \\\n                                           \\\n    asm volatile (                         \\\n        "svc #0\\n"                           \\\n        : "=r"(_arg1)                        \\\n        : "r"(_num)                          \\\n        : "memory", "cc"                     \\\n    );                                     \\\n    _arg1;                                 \\\n})

Writing a C Function that makes a system call

Let\u2019s write our first libc function, getpid.

getpid returns the pid of the current\nprocess.

It has a signature of pid_t getpid(void);, where\npid_t is int.

All we have to do is to make the right system call to the OS and\nreturn it.

typedef int pid_t;\n\npid_t getpid(void) {\n    return syscall0(__NR_getpid);\n}

This should return your pid, and we\u2019re done creating a libc function\nthat calls into the kernel.

Putting it all together

To put it all together, we need to write the _start\nfunction of our program, since we are going to link to our own libc.

For x86, that means adding this code to the top of your file:

asm(".section .text\\n"\n    ".weak _start\\n"\n    ".global _start\\n"\n    "_start:\\n"\n    "pop %rdi\\n"                // argc   (first arg, %rdi)\n    "mov %rsp, %rsi\\n"          // argv[] (second arg, %rsi)\n    "lea 8(%rsi,%rdi,8),%rdx\\n" // then a NULL then envp (third arg, %rdx)\n    "xor %ebp, %ebp\\n"          // zero the stack frame\n    "and $-16, %rsp\\n"          // x86 ABI : esp must be 16-byte aligned before call\n    "call main\\n"               // main() returns the status code, we'll exit with it.\n    "mov %eax, %edi\\n"          // retrieve exit code (32 bit)\n    "mov $60, %eax\\n"           // NR_exit == 60\n    "syscall\\n"                 // really exit\n    "hlt\\n"                     // ensure it does not return\n    "");

This sets up everything main needs to run.

For ARM64 (aarch64):

asm(".section .text\\n"\n    ".weak _start\\n"\n    ".global _start\\n"\n    "_start:\\n"\n    "ldr x0, [sp]\\n"              // argc (x0) was in the stack\n    "add x1, sp, 8\\n"             // argv (x1) = sp\n    "lsl x2, x0, 3\\n"             // envp (x2) = 8*argc ...\n    "add x2, x2, 8\\n"             //           + 8 (skip null)\n    "add x2, x2, x1\\n"            //           + argv\n    "and sp, x1, -16\\n"           // sp must be 16-byte aligned in the callee\n    "bl main\\n"                   // main() returns the status code, we'll exit with it.\n    "mov x8, 93\\n"                // NR_exit == 93\n    "svc #0\\n"\n    "");

And aggregating that together:

For x86:

#include <asm/unistd.h>\n\n#define syscall0(num)                     \\\n({                                        \\\n    long _ret;                              \\\n    register long _num  asm("rax") = (num); \\\n                                            \\\n    asm volatile (                          \\\n        "syscall\\n"                           \\\n        : "=a"(_ret)                          \\\n        : "0"(_num)                           \\\n        : "rcx", "r11", "memory", "cc"        \\\n    );                                      \\\n    _ret;                                   \\\n})\n\nasm(".section .text\\n"\n    ".weak _start\\n"\n    ".global _start\\n"\n    "_start:\\n"\n    "pop %rdi\\n"                // argc   (first arg, %rdi)\n    "mov %rsp, %rsi\\n"          // argv[] (second arg, %rsi)\n    "lea 8(%rsi,%rdi,8),%rdx\\n" // then a NULL then envp (third arg, %rdx)\n    "xor %ebp, %ebp\\n"          // zero the stack frame\n    "and $-16, %rsp\\n"          // x86 ABI : esp must be 16-byte aligned before call\n    "call main\\n"               // main() returns the status code, we'll exit with it.\n    "mov %eax, %edi\\n"          // retrieve exit code (32 bit)\n    "mov $60, %eax\\n"           // NR_exit == 60\n    "syscall\\n"                 // really exit\n    "hlt\\n"                     // ensure it does not return\n    "");\n\ntypedef int pid_t;\n\npid_t getpid(void) {\n    return syscall0(__NR_getpid);\n}\n\nint main() {\n  return getpid();\n}

For ARM64 (aarch64):

#define syscall0(num)                    \\\n({                                       \\\n    register long _num  asm("x8") = (num); \\\n    register long _arg1 asm("x0");         \\\n                                           \\\n    asm volatile (                         \\\n        "svc #0\\n"                           \\\n        : "=r"(_arg1)                        \\\n        : "r"(_num)                          \\\n        : "memory", "cc"                     \\\n    );                                     \\\n    _arg1;                                 \\\n})\n\nasm(".section .text\\n"\n    ".weak _start\\n"\n    ".global _start\\n"\n    "_start:\\n"\n    "ldr x0, [sp]\\n"              // argc (x0) was in the stack\n    "add x1, sp, 8\\n"             // argv (x1) = sp\n    "lsl x2, x0, 3\\n"             // envp (x2) = 8*argc ...\n    "add x2, x2, 8\\n"             //           + 8 (skip null)\n    "add x2, x2, x1\\n"            //           + argv\n    "and sp, x1, -16\\n"           // sp must be 16-byte aligned in the callee\n    "bl main\\n"                   // main() returns the status code, we'll exit with it.\n    "mov x8, 93\\n"                // NR_exit == 93\n    "svc #0\\n"\n    "");\n\ntypedef int pid_t;\n\npid_t getpid(void) {\n    return syscall0(__NR_getpid);\n}\n\nint main() {\n  return getpid();\n}

Now to compile this program, we can\u2019t link to libc, so, assuming the\nc file is called main.c

$ gcc -static -lgcc -nostdlib -g main.c -o main

to compile it, and run it:

$ ./main

Grab the exit status of the binary:

$? # some random number

And we\u2019re done with implementing getpid!

Reading and writing to files

getpid is a fine starting function, but we want to be\nable to read and write to files.

Let\u2019s start by defining the system calls that take 1 argument to 3\narguments:

In x86-64:

#define syscall1(num, arg1)                      \\\n({                                               \\\n    long _ret;                                     \\\n    register long _num  asm("rax") = (num);        \\\n    register long _arg1 asm("rdi") = (long)(arg1); \\\n                                                   \\\n    asm volatile (                                 \\\n        "syscall\\n"                                  \\\n        : "=a"(_ret)                                 \\\n        : "r"(_arg1),                                \\\n          "0"(_num)                                  \\\n        : "rcx", "r11", "memory", "cc"               \\\n    );                                             \\\n    _ret;                                          \\\n})\n\n#define syscall2(num, arg1, arg2)                \\\n({                                               \\\n    long _ret;                                     \\\n    register long _num  asm("rax") = (num);        \\\n    register long _arg1 asm("rdi") = (long)(arg1); \\\n    register long _arg2 asm("rsi") = (long)(arg2); \\\n                                                   \\\n    asm volatile (                                 \\\n        "syscall\\n"                                  \\\n        : "=a"(_ret)                                 \\\n        : "r"(_arg1), "r"(_arg2),                    \\\n          "0"(_num)                                  \\\n        : "rcx", "r11", "memory", "cc"               \\\n    );                                             \\\n    _ret;                                          \\\n})\n\n#define syscall3(num, arg1, arg2, arg3)          \\\n({                                               \\\n    long _ret;                                     \\\n    register long _num  asm("rax") = (num);        \\\n    register long _arg1 asm("rdi") = (long)(arg1); \\\n    register long _arg2 asm("rsi") = (long)(arg2); \\\n    register long _arg3 asm("rdx") = (long)(arg3); \\\n                                                   \\\n    asm volatile (                                 \\\n        "syscall\\n"                                  \\\n        : "=a"(_ret)                                 \\\n        : "r"(_arg1), "r"(_arg2), "r"(_arg3),        \\\n          "0"(_num)                                  \\\n        : "rcx", "r11", "memory", "cc"               \\\n    );                                             \\\n    _ret;                                          \\\n})

In ARM64 (aarch64):

#define syscall1(num, arg1)                       \\\n({                                                \\\n    register long _num  asm("x8") = (num);          \\\n    register long _arg1 asm("x0") = (long)(arg1);   \\\n                                                    \\\n    asm volatile (                                  \\\n        "svc #0\\n"                                    \\\n        : "=r"(_arg1)                                 \\\n        : "r"(_arg1),                                 \\\n          "r"(_num)                                   \\\n        : "memory", "cc"                              \\\n    );                                              \\\n    _arg1;                                          \\\n})\n\n#define syscall2(num, arg1, arg2)                 \\\n({                                                \\\n    register long _num  asm("x8") = (num);          \\\n    register long _arg1 asm("x0") = (long)(arg1);   \\\n    register long _arg2 asm("x1") = (long)(arg2);   \\\n                                                    \\\n    asm volatile (                                  \\\n        "svc #0\\n"                                    \\\n        : "=r"(_arg1)                                 \\\n        : "r"(_arg1), "r"(_arg2),                     \\\n          "r"(_num)                                   \\\n        : "memory", "cc"                              \\\n    );                                              \\\n    _arg1;                                          \\\n})\n\n#define syscall3(num, arg1, arg2, arg3)           \\\n({                                                \\\n    register long _num  asm("x8") = (num);          \\\n    register long _arg1 asm("x0") = (long)(arg1);   \\\n    register long _arg2 asm("x1") = (long)(arg2);   \\\n    register long _arg3 asm("x2") = (long)(arg3);   \\\n                                                    \\\n    asm volatile (                                  \\\n        "svc #0\\n"                                    \\\n        : "=r"(_arg1)                                 \\\n        : "r"(_arg1), "r"(_arg2), "r"(_arg3),         \\\n          "r"(_num)                                   \\\n        : "memory", "cc"                              \\\n    );                                              \\\n    _arg1;                                          \\\n})

Next, some definitions that the libc functions will use:

typedef int pid_t;\ntypedef int mode_t;\ntypedef int ssize_t;\ntypedef unsigned long long size_t;\n\n#define STDIN_FILENO  0\n#define STDOUT_FILENO 1\n#define STDERR_FILENO 2

And flags for calls to open:

For x86-64:

#define O_RDONLY            0\n#define O_WRONLY            1\n#define O_RDWR              2\n#define O_CREAT          0x40\n#define O_EXCL           0x80\n#define O_NOCTTY        0x100\n#define O_TRUNC         0x200\n#define O_APPEND        0x400\n#define O_NONBLOCK      0x800\n#define O_DIRECTORY   0x10000

For Arm64 (aarch64):

#define O_RDONLY            0\n#define O_WRONLY            1\n#define O_RDWR              2\n#define O_CREAT          0x40\n#define O_EXCL           0x80\n#define O_NOCTTY        0x100\n#define O_TRUNC         0x200\n#define O_APPEND        0x400\n#define O_NONBLOCK      0x800\n#define O_DIRECTORY    0x4000

Finally, let\u2019s define the functions we\u2019ll use:

ssize_t close(int fd) {\n  return syscall1(__NR_close, fd);\n}\n\nint fsync(int fd) {\n    return syscall1(__NR_fsync, fd);\n}\n\nssize_t read(int fd, void *buf, size_t count)\n{\n    return syscall3(__NR_read, fd, buf, count);\n}\n\nint open(const char *path, int flags, mode_t mode) {\n    return syscall3(__NR_open, path, flags, mode);\n}\n\nssize_t write(int fd, const void *buf, size_t count) {\n    return syscall3(__NR_write, fd, buf, count);\n}

And a helper function, strlen:

size_t strlen(const char *str) {\n    size_t len;\n\n    for (len = 0; str[len]; len++)\n        asm("");\n    return len;\n}

Finally, we can start writing a main function that uses this\ncode:

int main() {\n  const char* text = "hello world\\n"; // text to write\n  write(STDOUT_FILENO, text, strlen(text)); // write text to stdout\n  fsync(STDOUT_FILENO); // flush stdout\n\n  const char* file_text = "Hello from file"; // text to write to file\n  int fd = open("hello.txt", O_CREAT | O_TRUNC | O_RDWR, 0666); // open, truncate, create file hello.txt\n  write(fd, file_text, strlen(file_text)); // write the file text to file\n  fsync(fd); // flush hello.txt\n  close(fd); // close hello.txt\n\n  fd = open("hello.txt", O_RDONLY, 0666); // open the file hello.txt for reading\n\n  char read_from_file[strlen(file_text) + 1]; // the buffer to read into\n  read(fd, (void *)read_from_file, strlen(file_text)); // read from hello.txt to the buffer\n  read_from_file[strlen(file_text)] = '\\n'; // add a new line to the buffer\n  write(STDOUT_FILENO, read_from_file, strlen(read_from_file)); // write the buffer to stdout\n  fsync(STDOUT_FILENO); // flush stdout\n  close(fd); // close hello.txt\n}

After compiling it as above, you should have the following text:

hello world\nHello from file

With hello.txt containing\nHello from file.

Writing some simple libc code isn\u2019t that hard!

A Guide to RSS

2023-02-08T08:15:00-05:00

\n\n

At the start of the year I decided to read more quality stuff instead\nof mindlessly doom scrolling the internet. An internet diet. I believe\nin discussing tools + methods to solve problems, so here are some\nfindings I had setting up an RSS feed, reading from it, and some of the\nchallenges along the way.

Setup

If you\u2019re like me, you want a few things:

Offline reading
Syncing
Caching
Mobile + Terminal + Web + Desktop clients

To feasibly implement syncing across multiple devices, we\u2019ll need a\nserver to host our RSS feed. I self-host FreshRSS, which acts as\nboth a web client and the server. It allows for syncing from many\nclients, which is great, and it has an easy docker-compose file to pull\nin all necessary dependencies. FreshRSS exposes an API as well as a\nGoogle Reader compatible API, which most clients support. This supports\nauthentication, which is a nice feature for locking down your\nserver.

Finally, we\u2019ll need clients that implement offline reading and heavy\ncaching for good performance, for mobile, the terminal, and the\ndesktop.

I use an Samsung S10 as my phone, which runs android. The app I use\non mobile is FeedMe, hooked up to my FreshRSS server through the Google\nReader API. This heavily caches and works well offline, so on the go I\ncan read some feeds.

For a desktop client, I run a linux laptop, so I use Newsflash.\nNewsflash is similar to FeedMe, with support for the Google Reader API,\nand good offline capabilities. I tried a few other services, but they\ncouldn\u2019t parse my large RSS feed.

For a terminal client, I use newsboat. Newsboat is a curses-based\ntext only feed, so RSS feeds that don\u2019t send over the full article\ncontent are a bad reading experience on newsboat. To deal with this, I\nuse morss, which scrapes the link provided by the RSS feed,\nand then generates a new RSS feed with the content of the blog\nincluded.

morss can be\ndownloaded with pip.

Now you should have everything set up, and can add some feeds to your\nserver, pull them down to your clients, and are ready to go.

But what if you\u2019re new to this RSS game, and want to read some older\nposts?

Reading Older RSS Posts

Did you know that RSS can support older posts, and that blog\nplatforms like wordpress, blogger, and blogspot have those capabilities\nbuilt in? For example, let\u2019s say I wanted to read James Clear\u2019s old\nposts. His current feed, located at\nhttps://jamesclear.com/feed has 10 of his most recent blog\nposts. But he\u2019s published 300 more.

His blog actually has those old posts in RSS form for us, we just\nhave to find it.

Some blog sites\u2019 RSS supports the paged query parameter.\npaged=1 means the first page, and paged=2\nmeans the second page. In Jamesclear\u2019s case, we just have to binary\nsearch to find the last RSS feed supported by the website.

We could start out at\nhttps://jamesclear.com/feed?paged=100, realize that doesn\u2019t\nwork, try https://jamesclear.com/feed?paged=50, realize\nthat doesn\u2019t work, go to\nhttps://jamesclear.com/feed?paged=25, see that works, and\nthen figure out the last page, which as of now is page 30.

We can then download all of those rss feeds with a handy dandy bash\nscript:

#!/usr/bin/env bash\n\nfor num in $(seq 1 30); do\n  wget "https://jamesclear.com/feed?paged=$num" -O $num.rss\ndone

And then we can download the 30 rss feeds.

We now have 30 rss feeds that have the content we want to read. But\nhaving to add 30 entries to our RSS client is a bit of a pain: let\u2019s\naggregate them.

I wrote a little script that would batch these into feeds with 150\nposts each, using feedgen and feedparser.

#!/usr/bin/env python3\n\nimport feedparser\nfrom feedgen.feed import FeedGenerator\nfrom glob import glob\n\nfiles = glob('*.rss')\nfile_count = len(files)\nfile_generators = []\n\nfor i in range((file_count // 15) + 1):\n    fg = FeedGenerator()\n    fg.id('https://jamesclear.com/')\n    fg.title(f'James Clear Page {file_count // 15 - i + 1}')\n    fg.link(href='https://jamesclear.com/')\n    fg.author( {'name':'James Clear','email':'jamesclear@gmail.com'} )\n    fg.link(href='https://jamesclear.com/', rel='alternate' )\n    fg.logo('https://jamesclear.com/favicon.ico')\n    fg.subtitle('An Easy & Proven Way to Build Good Habits & Break Bad Ones')\n    fg.link(href='https://jamesclear.com/feed', rel='self' )\n    fg.language('en')\n    file_generators.append(fg)\n\nfor index, file_name in enumerate(files):\n    file_generator_index = index // 15\n    with open(file_name) as f:\n        xml = f.read()\n        d = feedparser.parse(xml)\n        for entry in d.entries:\n            fe = file_generators[file_generator_index].add_entry()\n            fe.id(entry['link'])\n            fe.title(entry['title'])\n            fe.link(href=entry['link'])\n            fe.content(entry['content'][0]['value'])\n            fe.description(entry['summary'])\n            fe.author(entry['authors'][0])\n            fe.guid(entry['link'])\n            fe.pubDate(entry['published'])\n\nfor index, fg in enumerate(reversed(file_generators)):\n    fg.rss_file(f"../site/james-clear-{index + 1}.rss")

We can then aggregate these rss feeds into two rss feeds, and then\nput them in a folder. I host my RSS feeds on the web using netlify,\nhttps://takashis-rss.netlify.app/ so that my RSS server can\npull these feeds and get the content from them.

You can use any web hosting service you like, or self-host it. It\u2019s\nup to you, it just needs to be online for the RSS server to be able to\npull down.

What about Atom?

Not all feeds are RSS feeds, some are Atom feeds. Take\nhttps://blog.computationalcomplexity.org/ for example. Atom doesn\u2019t\nsupport the paged query parameter. But it supports\nsomething even better. We\u2019ll first download their feed at\nhttps://blog.computationalcomplexity.org/feeds/posts/default and open it\nup in a text editor. Inside, we should be able to find that it has 2980\narticles, and they can be queried with a different query paramter,\ncalled start-index.

So\nhttps://blog.computationalcomplexity.org/feeds/posts/default?start-index=100\nfetches the 100th - 125th blog post. We want to do this until we hit the\n2980th article:

To do that, we edit the previous script a bit:

#!/usr/bin/env bash\n\nfor num in $(seq 1 25 2980); do\n  wget "https://blog.computationalcomplexity.org/feeds/posts/default?start-index=$num" -O $num.rss\ndone

And we can fetch all the posts, and aggregate them much the same way\nas above.

Rendering more content with morss

This is good enough for some feeds, but some feeds only show a little\nbit of content, which is undesirable for terminal readers like\nnewsboat, which don\u2019t fetch the entire article, just the\nRSS content.

We can download morss with pip and run this\nscript, which renders the feed with a web browser and scrapes it into an\nRSS feed.

#!/usr/bin/env bash\n\nfor num in $(seq 1 37); do\n  LIM_ITEM=-1 MAX_ITEM=-1 morss "https://travelfreak.com/feed?paged=$num" > $num.rss\ndone

This lets us get around those pesky feeds with no content and just a\nlink.

How to index your library

2023-01-23T08:34:32-05:00

Books are nice. They make transferring knowledge easy, and are a way\nto record things for the future. What\u2019s not nice is searching through\nthem. Who uses a glossary nowadays?

Another problem: my collection of books is too large to put in my\ntiny apartment:

$ find . -type f -name "*.pdf" | wc -l\n223

But if I have these books, I might as well index them, so I can\nsearch for them quickly:

First, since pdfs are binary, we\u2019ll have to extract their textual\ncontent to a file:

#!/usr/bin/env bash\n\ntitle_case() {\n  sed 's/.*/\\L&/; s/[a-z]*/\\u&/g' <<< $1 | tr '-' ' '\n}\n\nfor f in $(find . -type f -name "*.pdf"); do\n  based_name=$(basename $f .pdf)\n  txt_name="${f%.*}.txt"\n  if [[ -f "$txt_name" ]]; then\n    echo "$txt_name exists."\n    continue\n  fi\n  pdftotext $f\n  title_cased_name=$(title_case $based_name)\n  echo -e "$title_cased_name\\n\\n" | cat - $txt_name | sponge $txt_name\ndone

Next, we have to index the actual content. To do that, we\u2019ll put our\ntext content into a search engine, like sonic.

Grab the binary and set it up.

Next, we\u2019ll grab a client to pass data to sonic:

Create a Gemfile in a directory:

source 'https://rubygems.org'\ngem 'sonic-ruby'

Bundle install the gem:

$ bundle install

Next, create a file called ingest.rb. This will ingest\nyour textual data:

Since the client I\u2019m using right now doesn\u2019t have throttling, it\ncauses a panic caused by a buffer overflow on the search engine by\nshoving too much data too quickly. To fix this, I overwrite that method\nin the file and use it.

require 'sonic-ruby'\n\nmodule Sonic\n  module Channels\n    refine Ingest do\n      def push(collection, bucket, object, text, lang = nil)\n        puts "processing #{object}"\n        text_size = text.size\n        text = text.encode('UTF-8', :invalid => :replace, :undef => :replace)\n        right = 5000\n        left = 0\n        loop do\n          break if right > text_size\n\n\n          arr = [collection, bucket, object, quote(text[left...right].gsub('"', ''))]\n          arr << "LANG(#{lang})" if lang\n\n          execute('PUSH', *arr)\n          right += 5000\n          left += 5000\n        end\n      end\n    end\n  end\nend\n\nusing Sonic::Channels\n\n# Connect to the Sonic server on localhost:1491\nclient = Sonic::Client.new('localhost', 1491, 'SecretPassword')\n\n# Connect to the ingest channel\ningest = client.channel(:ingest)\n\nDir.glob("$PATH_TO_FILES/**/*.txt") do |f|\n  pdf_name = f[0..-3] + 'pdf'\n  text = IO.read(f)\n  ingest.push('books', 'all', pdf_name, text)\nend

Run that script with:

ruby ingest.rb

And then lets get searching! Create this file as\nsearch.rb:

require 'sonic-ruby'\n\nif ARGV.length != 1\n  puts "Too many names ... or not enough name?"\n  exit\nelse\n  name = ARGV[0]\nend\n\n# Connect to the Sonic server on localhost:1491\nclient = Sonic::Client.new('localhost', 1491, 'SecretPassword')\n\n# Connect to the search channel\nsearch = client.channel(:search)\n\nputs "searching for #{name}"\n\n# Search for a matching name and return ID\nsearch.query('books', 'all', name, 100).split(' ').each do |doc|\n  puts doc\nend

And then run a search:

$ ruby search.rb "Oysters"\nsearching for Oysters\n/books/programming-languages/python/effective-python.pdf\n/books/math/probability-theory-the-logic-of-science.pdf\n/books/algorithms/programming-pearls.pdf

And that\u2019s it. How to index your books in 15 minutes or less,\nguaranteed or your money back.

Recommended Resources

2023-01-18T21:39:20-05:00

\n\n

Here\u2019s a list of resources I enjoyed, with a few comments.

Papers

Hoard:\nA Scalable Memory Allocator for Multithreaded Applications: Hoard is\na parallel memory allocator that avoids fragmentation and false sharing\nby hoarding memory from the single-threaded system memory allocators and\ngiving it out in parallel for better performance.
The\nEmperor\u2019s Old Clothes: Hoare, of quicksort and ALGOL fame, explains\nwhy simplicitly is a virtue, and how a committee can destroy a language.\nHe mentions how bounds checking was implemented in ALGOL, and\ncomposition allowed it to grow to become a simple yet powerful\nlanguage.
Growing\na Language: A paper about how to build a language to grow \u2013 a\nlanguage should be flexible, have a welcoming community, have generics\nand operator overloading, and worse is better.
Three\nApproaches to the Quantitative Definition of Information: This paper\nformulates what\u2019s now called Kolmogorov complexity, which states that\nthe entropy of an object is determined by the smallest possible\nprogramming language that can express said information. This explains\ncompression, signal processing, and many other things in CS in just 5\npages.
Technology and\nCourage: A paper by technology pioneer, Ivan Sutherland, who was on\nthe team at MIT who built the first tablet. This paper goes into his\nhigh level thoughts about business, software, and life.
Reflections\non Trusting Trust: A classic paper on how software isn\u2019t really\ntrustable unless you examine both the tools to build it and the code\nitself. This paper was surprisngly prescient, given the security\nproblems we have now with software.
Hazard\nPointers: Safe Memory Reclamation for Lock-Free Objects: It was\nthought for a long time that garbage collection was a prerequisite to\nfast concurrent data structures, due to the lack of efficient\nbookkeeping for when to free parts of a data structure correctly. This\npaper discusses hazard pointers, a way to mark parts of a data structure\nas freeable even without garbage collection.
A\nLazy Concurrent List-Based Set Algorithm: This paper details\nconcurrent Skip Lists, which was implemented in the java collections\nlibrary in Java 1.6.
Crash\nOnly Software: A paper on explaining why crash-only software is\ngood, and a classification of such software.
Better bitmap\nperformance with Roaring bitmaps: Roaring bitmaps, a faster data\nstructure and more storage efficient data structure for bitmaps, by\nusing both run-length encoding and array packing.
RRB-Trees: Efficient\nImmutable Vectors: RRB-Trees, the Relaxed Radix Balanced Tree, is a\npurely functional data structure that is an improved version of the\nHAMT, the Hash Array Mapped Trie, which is the vector data structure in\nScala and Clojure.
Time\nBounds for Selection: A paper that explains the PICK\nselection algorithm, which can select the ith smallest of n numbers in\nO(n) time. This algorithm is more commonly known as quickselect.
Quicksort:\nA paper that explains the classic quicksort algorithm, the first O(n log\nn) sorting algorithm that took sublinear memory.
End\nto End Arguments in System Design: A paper on a design principle,\nthe \u201cEnd to End\u201d argument, that explains why having functionality at the\nlower levels of a system may be redundant or useless compared to putting\nthem at a higher level of a system.
Pattern Defeating\nQuicksort: A sorting algorithm for the Dutch National flag problem,\nwhich can solve it in O(nk) time. This is the current unstable sort\nalgorithm in rust, with an ~5-10% better performance over the current\nrust stable sort, timsort.

Courses

MIT\nPerformance Engineering of Software Systems: A good course to learn\nmore about computer architecture and what\u2019s going on under the hood when\nyou execute code.
Design\nand Implementation of Programming Languages: A course that\nincrementally introduces compilers, by implementing a compiler that\nemits x86 assembly.
The Modern\nAlgorithmic Toolbox: A course that thoroughly explains useful\nalgorithms and what they can be used for, with lots of real world\nexamples.
Database\nSystems: A course all about databases, explaining algorithms and\ndata structures for indexes, storage, logging, locking, and concurrency\nprotocols, like MVCC and 2PL.

Books

ARM\nSystem Developer\u2019s Guide: Details the ARM ISA. A bit dated at this\npoint, but covers the fundamentals.
Algorithms\nand Data Structures for Massive Datasets: Algorithms and data\nstructures that scale to meet the demands of large datasets.
Algorithms for Modern\nHardware: A great resource for learning about performance\nengineering.
Antifragile:\nA sequel to the Black Swan, focusing on things that get stronger when\nput under stress, and how to build systems that do the same.
Behind\nDeep Blue: A book about the team that built deep blue, the first\ncomputer to defeat the world chess champion.
Computational\nGeometry: a book that looks at algorithms geometrically. There\u2019s\nsections on calculating nearest neighbors, object collision, mapping\nalgorithms, dimension reduction, graphs, and even querying a database. I\ndidn\u2019t know computational geometry had so many applications!
Crafting\nInterpreters: Learn about compilers by implementing two interpreters\nfor a full featured language named lox.
Database Internals: A Deep\nDive: A book that teaches databases in two parts: storage engines,\nand then as distributed databases.
Designing\nData Driven Databases: The best book for learning about distributed\nsystems.
Game Programming\nPatterns: A book about Design Patterns for game development. The\nexamples are in C++, and focused on games, but can be applied to many\ndomains outside of it.
High\nPerformance MySQL: Learn about how to use and deploy MySQL, while\nsqueezing as much performance as possible while dodging pitfalls.
Irrational\nExuberance: A book about economic bubbles and regression to the\nmean.
Kill\nit with Fire: explains how to maintain and extend legacy systems,\nwith notes on leading teams and driving change.
Learn you a Haskell for Great\nGood!: An introduction to haskell, a statically and strongly typed\nfunctional programming language.
Learn you some Erlang for\nGreat Good!: An introductory book to Erlang/OTP, a functional\nprogramming language with lots of libraries suited for web\nprogramming.
Meditations:\nA classic book by Marcus Aurelius on the stoic philosophy, with gems\nthat still ring true today.
Operating Systems,\nThree Easy Pieces: A book that tackles teaching operating systems in\nthree pieces: virtualization, concurrency, and persistence.
Probabilistic\nData Structures and Algorithms for Big Data Applications: A book on\nprobabilistic data structures that are useful for big data. The six\nchapters cover many data structures for each problem.
Programming\nPearls: Algorithms and data structures that Jon Bentely, the creator\nof kd-trees explains in succint prose. There\u2019s a lot of great exercises\nand the author\u2019s storytelling makes the book an entertaining and fast\nread.
Proofs:\nA long form Mathematics Textbook: An approachable book on proofs\nwith lots of problems and stories. Proofs and being able to read\nmathematical notation are much more useful than I would\u2019ve thought.
Purely\nFunctional Data Structures: A book on data structures for functional\nlanguages, like SML.
SQL Performance\nExplained: The book that made indexes click for me, with\naccompanying SQL code in many dialects of SQL, like SQL Server, Oracle,\nMySQL, and Postgres.
Seven\nConcurrency Models in Seven Weeks: concurrency models in many\ndifferent languages, including java, clojure, and erlang, and how they\ndiffer, with quick explanations on the pros and cons of each.
Seven\nDatabases in Seven Weeks: a whirlwind tour of some common NoSQL\ndatabases, like Redis, Neo4j, DynamoDB, and Hbase.
Systems\nPerformance: Enterprise and the Cloud: Learn how to profile and\nimprove performance and observability of systems. A must have for\nanybody learning in a big tech company.
The\nArt of Multiprocessor Programming: A book that goes into great depth\nabout concurrent programming, with thorough exercises.
The\nBlack Swan: An options trader explains why outlier events are\nactually quite common and hard to hedge against. I picked this up for\nthe nod to Popper, but stayed for the thoughts on the market.
The\nInnovator\u2019s Dilemma: How companies adjust (and don\u2019t adjust) to\nchange, and how incumbents lose market share to new competitors.
The Linux Programming\nInterface: a book on the linux OS, glibc, and system calls. Waiting\nfor a second edition soemday to cover io_uring, cgroups, and other\nthings.
The\nLittle Book of Semaphores: A gentle introduction to mutual exclusion\nby introducing semaphores, the basis of almost all concurrent\nalgorithms.

Utilities

fish: A shell with completions\nand sane scripting
pagefind:\nStatic search that works offline without needing to run a local\nserver
pandoc: A universal document\nconverter
pdf2txt: Reading\npdfs as plain text
ripgrep: Grep\nbut faster
scalene: To\ndiagnose python performance problems
termux: Run a unix terminal on\nyour android phone
panamax: Mirror\ncrates.io locally

The agora and the hivemind

2022-06-11T19:53:26-05:00

Have you ever felt like there\u2019s so much to do, but you never make any\nprogress? When I worked as an office worker, my manager drilled it into\nus that to remain competitive in an increasingly connected global\neconomy, workers would have to be able to multitask \u2013 handle many\nrequests in flight concurrently and work on them in parallel, bringing\ntogether stakeholders to build consensus.

As a line cook in a restaurant, it was expected that cooks would be\nable to do any role in the restaurant \u2013 order ingredients, clean dishes,\nserve customers, prep ingredients, the whole shebang, especially in\nparallel.

As a warehouse worker, we were expected to be able to handle a larger\nnumber of requests every month, with the same or smaller team, mentally\njuggling many POs at the same time to meet tighter deadlines for the\nsake of efficiency.

As a tech worker, we were expected to exceed for our customers \u2013\nanswer pings at 3 a.m, make sure our service was free of security\ndefects, make a bi-weekly \u201cagile\u201d show and tell, and work on our project\nwhile having 7 hours of meetings a day.

All these jobs had disastrous consequences \u2013 the office workers\nbecame so unproductive that they quit in droves, many of the line cooks\nsuffered physical injury, including losing a finger, and the warehouse\nworkers, well, many of them had to go on disability. Many of the tech\nworkers quit as well, tired of being unproductive and overworked.

I could pontificate about how shareholder capitalism is at odds with\nlabor, but I\u2019ll leave that for another time. Instead, I\u2019ll talk about\nhow overstimulation via the internet is leading to an increase in\nmultitasking, crippling our one true superpower \u2013 the ability to\nconcentrate.

Lots has been said about increasing productivity in the workplace\n(it\u2019s been a hot topic for over a century now), but I\u2019ve noticed in\nmyself and others that we can\u2019t concentrate as well as we used to. We\nturn to our phones for diet entertainment, forgoing the relaxation and\nconcentration of past times.

Sustenance farmers did it best, I think. When I lived on a farm, the\nschedule was simple \u2013 wake up at 5 or so, before the sun rises to get\nwashed up, eat a small meal, and dressed to tackle the day\u2019s work.

You\u2019d work slowly, maybe handling some maintenance tasks like setting\nup a net so the animals don\u2019t eat your crops, or patching holes in rice\npaddies so those dang beavers don\u2019t drain all the water from your patch.\nIn the summer, it\u2019d get too hot by 11 or so, so you\u2019d retire back home,\neat a small lunch, and take a nap \u2013 four, five hours.

After that, you\u2019d set out again after the heat cooled off, finishing\noff your tasks for the day, maybe leaving some tools outside for\ntomorrow, taking note of what needs to be done, before retiring to eat\ndinner, wash up and sleep for the next day.

Sometimes you needed your plots to lay fallow, or sometimes the other\npeople in your village would share tools, or maybe someone needed to be\na substitute teacher for the little ones because the only teacher in\ntown had a family emergency. You wouldn\u2019t think of things in terms of\nmoney \u2013 you\u2019d repay each other with favors and do the right thing (most\nof the time). You could borrow someone else\u2019s plot that hasn\u2019t been used\nin a while, maybe in exchange for something like crops or tools or a\nservice, like fixing up their house\u2019s roof. We\u2019d congregate at the town\ncenter, asking for favors, doing favors, and building up the shared\ncommunity. People were happy, for the most part.

Most of us used to lead that kind of life, until the businesspeople\nlured us over to factories with the promise of wages and\nbenefits, promising us milk and honey but leaving us with\nregrets and sadness.

Thanks for nothing, capitalists.

Industrialization was one game changer, but the internet was another.\nInstead of having a set of town centers, one for each town or\nthereabouts, we can now have a town center in our own houses (or pocket,\nwith cell phones).

With this kind of super power, we should be able to use it to our own\nbenefit. And we did, kind of. There\u2019s lots of nice services these days\nthat make our lives easier, but they don\u2019t make our lives more\nfulfilling.

And in exchange, we\u2019ve lost our ability to focus, robbing us of one\nof the paths to fulfillment.

We\u2019ve all become part of the hivemind.

It\u2019s impossible to have all of us disconnect. There will only be a\nfew people who want to build their own house off the grid and grow their\nown crops and maintain a solar panel for electricity.

For the rest of us, we can only set boundaries \u2013 but I implore you to\nthink about the agora model \u2013 where we only go to town a few days out of\nthe week at most. We don\u2019t always have to be connected. A slower pace of\nlife isn\u2019t bad.

For me, as a programmer, that means downloading the resources I need\nto work offline. On my personal computer, I have a copy of the\ndocumentation of all the tools I work with, books, papers, and\nsyllabuses for courses I\u2019d like to learn from, and offline copies of\ninteresting websites that I could learn a lot from (mainly in the form\nof rss, but a recursive wget does the trick for sites that don\u2019t have an\nrss feed), copies of music I like listening to.

I only need to go online sometimes, since I\u2019ll always have something\nto enjoy.

One side effect is that you won\u2019t hate the airplane as much \u2013 I just\nopen up a paper I\u2019ve wanted to read, a textbook, a course, and jump\nright in. It\u2019s a wonderful way to pass the time and really feel\nproductive by the end.

Learn Computer Science

2022-06-11T19:50:29-05:00

\n\n

I have a confession to make. I, (and most of us) have been stuck on\nfad diets for CS. It\u2019s so nice to read easy books, like learn C++ in 21\ndays, or doing some leetcode problems. You learn a bit, feel that warm\nglow of getting smarter, and trick yourself into feeling productive.

You\u2019re not really getting any better, without learning the\nfundamentals.

I decided to cobble up a set of resources to go through, as a fusion\nof Teach Yourself Computer\nScience and Steve Yegge\u2019s\nrecommendations.

Teach Yourself CS has a detailed list of good resources for the more\npractical aspects of CS, whereas Steve Yegge\u2019s recommendations focus\nmore on the mathematical side \u2013 the first four courses he recommends\nare: discrete math, linear algebra, statistics, and theory of\ncomputation.

Steve Yegge\u2019s list omits Computer Architecture, which is a glaring\nomission \u2013 an Operating Systems course doesn\u2019t have enough time to cover\nall the interesting parts of concurrency, parallelism, and optimization\nthat a computer architecture course would.

Teach Yourself CS doesn\u2019t mention Theory of Computation, and is\nlighter on the math background, giving one resource for math. Theory of\nComputation is a bit more dated (swallowed up by all the other fields),\nbut is still useful for its applications.

To that end, I\u2019ve fused them both, and skimmed some resources to put\non this list.

This list is incomplete and changing all the time, but hey, isn\u2019t\nthat what agile development is all about?

Programming

Textbooks

Structure and Interpretation of Computer Programs

Courses

Papers

Projects

Discrete Math

Textbooks

Discrete Mathematics, an open introduction

Linear Algebra

Textbooks

No bullshit guide to linear Algebra, Savov

Statistics

Textbooks

Statistics fourth ed, Friedman et. al
Think stats, Downey
Think Bayes, Downey

Theory of Computation

Textbooks

Theory of Computation, Hefferon
Computational Complexity, Arora and Barak

Computer Architecture

Textbooks

Computer Architecture, a Programmers Perspective
Computer Architecture, Patterson and Hennessey

Resources

Some Assembly Required

Algorithms and Data Structures

Textbooks

The Algorithm Design Manual, Skiena
The Art of Multiprocessor programming

Operating Systems

Textbooks

Operating Systems, Three Easy Pieces

Networking

Textbooks

Computer Networks, a Systems approach

Projects

Chitcp

Databases

Textbooks

Database Internals

Projects

Chidb

Compilers

Textbooks

Crafting Interpreters

Resources

Chibicc

Distributed Systems

Textbooks

Designing Data Intensive Applications

Writing Compilers

2022-01-10T17:48:22-05:00

\n\n

In his post \u201cRich Programmer Food\u201d, Steve Yegge explains why you\nshould learn compilers:

\n
If you don\u2019t know how compilers work, then you don\u2019t know how\ncomputers work. If you\u2019re not 100% sure whether you know how compilers\nwork, then you don\u2019t know how they work.
\n

Throughout this post, Steve has some witticisms and some harsh\nrealizations:

\n
If you don\u2019t take compilers then you run the risk of forever being on\nthe programmer B-list: the kind of eager young architect who becomes a\nsaturnine old architect who spends a career building large systems and\nbeing damned proud of it.
\n

While I wouldn\u2019t say the article wholly convinced me\nto learn about compilers, I do agree that compilation problems are\neverywhere.

In front-end work, where I started off, there\u2019s a glut of frameworks.\nReact, Angular, Vue, Svelte, with some tooling like webpack, parcel,\nesbuild, babel and browserify.

What do they all have in common? They\u2019re all compilers. React takes\nJSX and turns it into HTML and JS.

Angular launched Typescript, which is a language that compiles to\nJavascript.

Vue has .vue templates, which compile to HTML, CSS, and\nJS.

Svelte has its own .svelte files.

All the other build tools I mentioned take javascript, minify it,\ntree-shake it (dead-code elimination) and bundle it for you so it can be\nserved on the internet.

All of these are compiler problems.

My favorite language, Rust, has a great deal of compiler work in the\ncore language and design tradeoffs to make it easy for new people to\nadopt and experienced people to enjoy.

In Mobile, Kotlin and Swift, as well as many libraries, all reduce to\ncompiler problems \u2013 they end up manipulating ASTs to produce better\ncode, or compile to some bytecode that is executed on the target\nplatform.

Compiler problems really are everywhere.

Here\u2019s another quote from Ras Bodik:

\n
Don\u2019t be a boilerplate programmer. Instead, build tools for users and\nother programmers. Take historical note of textile and steel industries:\ndo you want to build machines and tools, or do you want to operate those\nmachines?
\n

Got the message? Compilers are really important. Or so I think. But\nhow does one learn compilers? Well, let\u2019s scour the internet.

A Plan of Attack

There\u2019s no shortage of great materials on compilers on the internet,\nbut I want to focus on three resources I\u2019m currently using to learn\ncompilers, since I think they run the gamut: One\u2019s a great starting\nresource, another is a great medium level resource, and one is extremely\nhard but rewarding.

The Resources:

Compiling to Assembly from Scratch

Compiling to Assembly from Scratch is the first resource I looked at\nto start my compiler writing journey. It\u2019s a short book on how to write\na small ARM32 emitting compiler in Typescript. It does this by parsing\nusing a parser combinator, and then emitting simple ARM32 code using the\nvisitor pattern.

Parser combinators are a technique for parsing that\u2019s been picking up\nsteam recently. Because Typescript has first-class regex, this technique\nreally fits well here.

The book also goes over basic ARM32 instructions, and at the end of\nthis pretty short book (~200 pages), you have a working compiler that\nturns a subset of javascript into ARM32.

Worth every penny.

Crafting Interpreters

Crafting Interpreters is a great book \u2013 the first half of the book\ncovers writing a tree-walk interpreter in Java, while the second half of\nthe book involves writing a bytecode VM in C for a non-trivial language\ncalled \u201cLox\u201d.

Bob Nystrom really knows his stuff \u2013 the prose is clean, and every\nline of code written is well explained.

That being said, for the first part, I didn\u2019t really want to write\nany Java (sorry Oracle), so I found a transcription of the first part of\nthe book\u2019s code in Rust https://github.com/jeschkies/lox-rs and used that code\nas the basis for the first part of the book.

At the end of the first part of the book, I really felt as though I\ngot the hang of the basics of compiler writing.

I still need to go through the second part, but I\u2019m really enjoying\nit so far!

ChibiCC

This resource is a bit different: It\u2019s a git repo by the creator of\nmold (the new LLVM linker) to create a C Compiler (CC) in C.

This repository follows the paper \u201cAn Incremental Approach to\nCompiler Construction\u201d by Ghuloum, http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf which\nadvocates writing a simple compiler step by step and adding\nfunctionality little by little.

Rui Ueyama, the author of this repo, writes clean C code for every\ncommit, and each message details adding a new feature to the C compiler.\nThere\u2019s no accompanying instructional material, so you\u2019re on your own to\nread the diffs and try to ascribe meaning to them, but it is important\nto read lots of code, and what better way to start than in such a\nstructured format?

Conclusion

After going through a few resources for compiler construction, I\u2019m\nstarting to get a better understanding of compilers, and seeing those\nkinds of problems crop up all the time. Writing compilers has also\nmotivated me to read more code, and I\u2019m hoping to be able to read code\nfrom larger projects to write better code in the future!

Writing Portable C

2021-11-15T18:06:09-05:00

\n\n

On my free time I\u2019ve been hacking away on unix-utils, a\nproject that implements some common unix utilities. In order to test how\nportable C really is for the average programmer, I decided to lean on\nopen source to see how many platforms I could target my C with. I wanted\nto write the most portable code I could (that meant targeting only\nPOSIX) functions and using a minimalistic stdlib that tried its best to\nadhere to the standards (musl). This let me write C code for about 50\ntargets. Let\u2019s go deeper into the background and how that was all\npossible:

POSIX and SUS

In the 70s, C was made at AT&T. In the 80s, it escaped, becoming\nmore popular, before eventually becoming standardized by ANSI (C89). But\nwhat about standards for Operating Systems? Enter POSIX (the Portable\nOperating System Interface), a standard for operating system interfaces.\nIn 1988, the first version of POSIX was released by the IEEE, which\ndetailed common interfaces, like Signals, pipes, and the C standard\nlibrary.

The POSIX standards were fairly minimalistic in the 80s, adding some\nextensions (like real time programming) and thread extensions) in the\nearly 90s before being subsumed by the Austin Group, a committee that\ndesigned the Single Unix Specification. The Austin Group has steered the\nPOSIX standards since 1997, creating standards like (POSIX 1997/SUS v2),\n(POSIX 2001/SUS v3), and (POSIX 2008/SUS v4).

Since 2008, there have been two minor corrections to the POSIX\nstandards (one in 2011 and 2017), but the two most common POSIX\nstandards in use are POSIX 2001 and 2008, which is where we\u2019ll be\ndirecting most of our attention to.

POSIX compliance in particular ends up being extremely important,\nbecause most Operating Systems have at least some level of POSIX\ncompliance. Linux, the BSDs, Mac OS, and Windows all do to some extent.\nThat means that our C code can target all of them by following the\nstandards, which makes our code more flexible.

This is so important that GCC (GNU\u2019s C Compiler) ended up\nimplementing a flag that checks for strict compliance to the POSIX\nstandard of your choice.

In my Makefile, I have this line, which says to compile my code\nstrictly according to the standard.

CFLAGS = -std=c99 -D_POSIX_C_SOURCE=200809L

Since I wrote the first draft of my utilities on a Mac OS computer\nwith no POSIX compatibility flags, you can imagine there was a lot of\nbreakage. As to why there was so much breakage, well, that requires\nanother history lesson.

GCC vs MUSL

In the 80s, the Free Software Foundation (FSF) wanted to create the\nideal \u201cFree\u201d programming environment. To do so, they were going to start\nfrom the top-down, by implementing the user space (a C compiler, a\nshell, the POSIX shell utilities, etc), and then build an OS kernel (GNU\nHurd). GNU succeeded at one part of their mission, by providing the most\ncommon userspace tools to date (GCC, Bash, and the GNU utils). However,\nGNU\u2019s kernel lost out to Linux, and the rest is history.

Linux started out only supporting GCC tools for its userspace, but\nnow it can support a wide variety of C standard libraries (libc for\nshort). One of those ends up being Musl, the standard library of this\narticle.

The choice of standard library would end up being entirely\ninconsequential if not for one detail: Musl supports static linking, and\nGCC does not.

Sure GCC supports a lot of non-standard extensions, and sure GCC\nexecutables are more bloaty than their musl counterparts, but static\nlinking lets us execute our binaries without having installed a libc on\nthe platform.

That means our code can reach even more users!

Much blood has been spilt on static vs dynamic linking, so I will\nspare you the carnage by simply saying that static linking tends to be\nmore convenient for the end user (they require less dependencies on\ntheir side to run the code), which is good for us, the application\nbuilders.

What Sacrifices were made?

How do you make static binaries?

Going back to building some unix utilities, I downloaded a musl-gcc\ncompiler, logged into my linux VM and started compiling.

The first issue I ran into was that musl-gcc didn\u2019t compile static\nbinaries.

I added the flag -static to my build, but\nfile and ldd ended up telling me that my\nbinary was still dynamically linked.

I dug through troves of documentation. Eventually, I discovered that\nit wasn\u2019t enough just to provide the -static flag, because\nGCC can ignore it. You have to provide another flag,\n--static as well. Oh, and if that wasn\u2019t enough, that still\ndidn\u2019t compile static binaries. You had to disable pie, or\nposition independent executables with the flag\n-no-pie as well.

Finally, I had compiled a hello world binary statically. Time to move\non!

Don\u2019t name your functions `_init`

I then tried to compile my utilities. I wanted to decrease\nduplication so I wrote a header file with a function called\n_init. This ended up causing a duplicate symbol error (musl\ndefines this function in crti.o first).

Of course, GCC never complained, so I had to rename this\nfunction.

`getopt_long` doesn\u2019t exist

Next up, getopt_long (Get options with long flags) isn\u2019t\nPOSIX standard. Unshocking. POSIX only specifies the normal\ngetopt, which supports short options only. Long options\nlike --file or --color are a GNUism.

I ended up finding a copy of getopt_long online and\nrewriting my header file includes for my utilities.

Sysctl isn\u2019t standard

Next up, I had a compiler error where my implementation of\nuptime failed to compile. <sys/sysctl.h>\nis a Macism, and not part of POSIX. Linux offers it up in\n<linux/sysctl.h> for convenience, but as its name\nmight indicate, it\u2019s not portable.

Next!

lstat has optional fields

In my implementation of stat, I used the functions\nmajor, minor and ctime, none of\nwhich are POSIX compliant. They\u2019re useful on mac os, so I can gate them\nbehind an __APPLE__ macro, but that makes the code less\nsuccinct. Oh well.

`NI_MAXHOST` isn\u2019t defined

As an oddity, musl doesn\u2019t define NI_MAXHOST at all.\nThis is useful for dig, which returns the ip address for a\ngiven address. I ended up defining it if it wasn\u2019t already defined.

Getting the Toolchains

With all these changes made, our code will now compile for Linux +\nmusl, thankfully. The next problem was actually getting our\ntoolchains.

Luckily, after some googling, I found out about <musl.cc>, a\nwebsite which releases versions of musl-gcc toolchains.

Now, since I didn\u2019t want to create an undue amount of load onto this\nwebsite, I created a mirror of it: https://github.com/Takashiidobe/muslcc.

Next, I had to create a github action that would fetch the compiler\nrequired, set it up properly, compile all of the binaries, strip the\ndebug information, tar them into one directory, and release them on a\npush to tags. Phew!

This part turned out to be a lot of guesswork and letting it run, so\nI\u2019ll leave the final results here:

https://github.com/Takashiidobe/unix-utils/blob/master/.github/workflows/release.yml

And the repo here:

https://github.com/Takashiidobe/unix-utils

In Short

It\u2019s amazing that you can write code that targets so many\narchitectures, and compile to them easily, all for free, with the power\nof open source (and Microsoft\u2019s wallet, thanks Github Actions).

With this, I was able to build for 52 architectures and release code\nfor them (I ended up adding in support for x86_64 Darwin and arm64\nDarwin).

Viva portable code.

Building Rust binaries for different platforms

2021-11-03T09:24:46-05:00

\n\n

Rust has great support for cross compilation, with\ncross, you can install the required c toolchain + linker\nand cross compile your rust code to a binary that runs on your targeted\nplatform. Sweet!

If you\u2019d like to look at the code and results, it\u2019s in this repo\nhere: https://github.com/Takashiidobe/rust-build-binary-github-actions

Rust library writers use this feature to build and test for other\nplatforms than their own: hyperfine\nfor example builds for 11 different platforms.

The rustc\nbook has a page on targets and tiers of support. Tier 1 supports 8\ntargets:

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

Tier 1
aarch64-unknown-linux-gnu
i686-pc-windows-gnu
i686-unknown-linux-gnu
i686-pc-windows-msvc
x86_64-apple-darwin
x86_64-pc-windows-gnu
x86_64-pc-windows-msvc
x86_64-unknown-linux-gnu

Tier 2 with Host tools supports 21 targets.

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

Tier 2
aarch64-apple-darwin
aarch64-pc-windows-msvc
aarch64-unknown-linux-musl
arm-unknown-linux-gnueabi
arm-unknown-linux-gnueabihf
armv7-unknown-linux-gnueabihf
mips-unknown-linux-gnu
mips64-unknown-linux-gnuabi64
mips64el-unknown-linux-gnuabi64
mipsel-unknown-linux-gnuabi
powerpc-unknown-linux-gnu
powerpc64-unknown-linux-gnu
powerpc64le-unknown-linux-gnu
riscv64gc-unknown-linux-gnu
s390x-unknown-linux-gnu
x86_64-unknown-freebsd
x86_64-unknown-illumos
arm-unknown-linux-musleabihf
i686-unknown-linux-musl
x86_64-unknown-linux-musl
x86_64-unknown-netbsd

Let\u2019s try to build a binary for all 29 targets.

A Note on Targets

The Rust RFC for Target support: https://rust-lang.github.io/rfcs/0131-target-specification.html

A target is defined in three or four parts:

$architecture-$vendor-$os-$environment

The environment is optional, so some targets have three parts and\nsome have four.

Let\u2019s take x86_64-apple-darwin for example.

x86_64 is the architecture
apple is the vendor
darwin is the os

You\u2019ll notice here that there is no $environment. This\ntarget assumes the environment, which is most likely to be\ngnu.

Let\u2019s take one with four parts:\ni686-pc-windows-msvc.

i686 is the architecture
pc is the vendor
windows is the os
msvc is the environment

In this target, the environment is specified as msvc,\nthe microsoft C compiler. This is the most popular compiler for windows,\nbut it need not be: if you look in the same tier 1 table, there\u2019s this\ntarget: i686-pc-windows-gnu.

The only thing that\u2019s changed is the environment is now\ngnu. Windows can use gcc instead of\nmsvc, so building for this target uses the gcc\ninstead of msvc.

Architectures

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

Architecture	Notes
aarch64	ARM 64 bit
i686	Intel 32 bit
x86_64	Intel 64 bit
arm	ARM 32 bit
armv7	ARMv7 32 bit
mips	MIPS 32 bit
mips64	MIPS 64 bit
mips64el	MIPS 64 bit Little Endian
mipsel	MIPS 32 bit Little Endian
powerpc	IBM 32 bit
powerpc64	IBM 64 bit
rsicv64gc	RISC-V 64 bit
s390x	IBM Z 32 bit

Vendors

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

Vendor	Notes
pc	Microsoft
apple	Apple
unknown	Unknown

Operating Systems

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

Operating System	Notes
darwin	Apple\u2019s OS
linux	Linux OS
windows	Microsoft\u2019s OS
freebsd	FreeBSD OS
netbsd	NetBSD OS
illumos	Illumos OS, a Solaris derivative

Environments

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

Environment	Notes
musl	Musl C library
gnu	GNU\u2019s C library
msvc	Microsoft Visual C library
freebsd	FreeBSD\u2019s C library
netbsd	NetBSD\u2019s C library
illumos	Illumos\u2019 C library

When you go to the releases tab to download a particular binary,\nyou\u2019ll need to know these four things to download a binary that runs on\nyour system.

Now, let\u2019s start building for all these systems.

Building Binaries for ~30 Targets

We\u2019re going to use Github Actions, a task runner on github.com to\nbuild our binaries. Our binary is a simple hello world\nbinary.

If you\u2019d just like to look at the github actions file, it\u2019s located\nhere: https://github.com/Takashiidobe/rust-build-binary-github-actions/blob/master/.github/workflows/release.yml

Conceptually, we\u2019d like to do the following:

Set up our target environments.
Download the C compiler (environment) we need.
Download a docker image of the OS we require.
Download the rust toolchain onto docker container.
Build the binary.
Optionally strip debug symbols.
Publish it to the github releases tab.

We\u2019ll first start out by defining our github action and setting up\nthe target environments:

name: release\n\nenv:\n  MIN_SUPPORTED_RUST_VERSION: "1.56.0"\n  CICD_INTERMEDIATES_DIR: "_cicd-intermediates"\n\non:\n  push:\n    tags:\n      - '*'\n\njobs:\n  build:\n    name: ${{ matrix.job.target }} (${{ matrix.job.os }})\n    runs-on: ${{ matrix.job.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        job:\n          # Tier 1\n          - { target: aarch64-unknown-linux-gnu      , os: ubuntu-20.04, use-cross: true }\n          - { target: i686-pc-windows-gnu            , os: windows-2019                  }\n          - { target: i686-unknown-linux-gnu         , os: ubuntu-20.04, use-cross: true }\n          - { target: i686-pc-windows-msvc           , os: windows-2019                  }\n          - { target: x86_64-apple-darwin            , os: macos-10.15                   }\n          - { target: x86_64-pc-windows-gnu          , os: windows-2019                  }\n          - { target: x86_64-pcwindows-msvc          , os: windows-2019                  }\n          - { target: x86_64-unknown-linux-gnu       , os: ubuntu-20.04                  }\n          # Tier 2 with Host Tools\n          - { target: aarch64-apple-darwin           , os: macos-11.0                    }\n          - { target: aarch64-pc-windows-msvc        , os: windows-2019                  }\n          - { target: aarch64-unknown-linux-musl     , os: ubuntu-20.04, use-cross: true }\n          - { target: arm-unknown-linux-gnueabi      , os: ubuntu-20.04, use-cross: true }\n          - { target: arm-unknown-linux-gnueabihf    , os: ubuntu-20.04, use-cross: true }\n          - { target: armv7-unknown-linux-gnueabihf  , os: ubuntu-20.04, use-cross: true }\n          - { target: mips-unknown-linux-gnu         , os: ubuntu-20.04, use-cross: true }\n          - { target: mips64-unknown-linux-gnuabi64  , os: ubuntu-20.04, use-cross: true }\n          - { target: mips64el-unknown-linux-gnuabi64, os: ubuntu-20.04, use-cross: true }\n          - { target: mipsel-unknown-linux-gnu       , os: ubuntu-20.04, use-cross: true }\n          - { target: powerpc-unknown-linux-gnu      , os: ubuntu-20.04, use-cross: true }\n          - { target: powerpc64-unknown-linux-gnu    , os: ubuntu-20.04, use-cross: true }\n          - { target: powerpc64le-unknown-linux-gnu  , os: ubuntu-20.04, use-cross: true }\n          - { target: riscv64gc-unknown-linux-gnu    , os: ubuntu-20.04, use-cross: true }\n          - { target: s390x-unknown-linux-gnu        , os: ubuntu-20.04, use-cross: true }\n          - { target: x86_64-unknown-freebsd         , os: ubuntu-20.04, use-cross: true }\n          - { target: x86_64-unknown-illumos         , os: ubuntu-20.04, use-cross: true }\n          - { target: arm-unknown-linux-musleabihf   , os: ubuntu-20.04, use-cross: true }\n          - { target: i686-unknown-linux-musl        , os: ubuntu-20.04, use-cross: true }\n          - { target: x86_64-unknown-linux-musl      , os: ubuntu-20.04, use-cross: true }\n          - { target: x86_64-unknown-netbsd          , os: ubuntu-20.04, use-cross: true }

Checking out our code:

steps:\n - name: Checkout source code\n uses: actions/checkout@v2

Downloading the C compiler

Most of the time, the C compiler we need is already installed, but in\nsome cases it\u2019ll be overriden by another compiler.

We\u2019ll need to download the correct suitable C compiler in that case:\n(i686-pc-windows-gnu has gcc, but it\u2019s not on the $PATH).

    - name: Install prerequisites\n      shell: bash\n      run: |\n        case ${{ matrix.job.target }} in\n          arm-unknown-linux-*) sudo apt-get -y update ; sudo apt-get -y install gcc-arm-linux-gnueabihf ;;\n          aarch64-unknown-linux-gnu) sudo apt-get -y update ; sudo apt-get -y install gcc-aarch64-linux-gnu ;;\n          i686-pc-windows-gnu) echo "C:\\msys64\\mingw32\\bin" >> $GITHUB_PATH\n        esac

Installing the Rust toolchain

    - name: Install Rust toolchain\n      uses: actions-rs/toolchain@v1\n      with:\n        toolchain: stable\n        target: ${{ matrix.job.target }}\n        override: true\n        profile: minimal # minimal component installation (ie, no documentation)

Building the executable

    - name: Build\n      uses: actions-rs/cargo@v1\n      with:\n        use-cross: ${{ matrix.job.use-cross }}\n        command: build\n        args: --locked --release --target=${{ matrix.job.target }}

Stripping debug information from binary

    - name: Strip debug information from executable\n      id: strip\n      shell: bash\n      run: |\n        # Figure out suffix of binary\n        EXE_suffix=""\n        case ${{ matrix.job.target }} in\n          *-pc-windows-*) EXE_suffix=".exe" ;;\n        esac;\n        # Figure out what strip tool to use if any\n        STRIP="strip"\n        case ${{ matrix.job.target }} in\n          arm-unknown-linux-*) STRIP="arm-linux-gnueabihf-strip" ;;\n          aarch64-pc-*) STRIP="" ;;\n          aarch64-unknown-*) STRIP="" ;;\n          armv7-unknown-*) STRIP="" ;;\n          mips-unknown-*) STRIP="" ;;\n          mips64-unknown-*) STRIP="" ;;\n          mips64el-unknown-*) STRIP="" ;;\n          mipsel-unknown-*) STRIP="" ;;\n          powerpc-unknown-*) STRIP="" ;;\n          powerpc64-unknown-*) STRIP="" ;;\n          powerpc64le-unknown-*) STRIP="" ;;\n          riscv64gc-unknown-*) STRIP="" ;;\n          s390x-unknown-*) STRIP="" ;;\n          x86_64-unknown-freebsd) STRIP="" ;;\n          x86_64-unknown-illumos) STRIP="" ;;\n        esac;\n        # Setup paths\n        BIN_DIR="${{ env.CICD_INTERMEDIATES_DIR }}/stripped-release-bin/"\n        mkdir -p "${BIN_DIR}"\n        BIN_NAME="${{ env.PROJECT_NAME }}${EXE_suffix}"\n        BIN_PATH="${BIN_DIR}/${BIN_NAME}"\n        TRIPLET_NAME="${{ matrix.job.target }}"\n        # Copy the release build binary to the result location\n        cp "target/$TRIPLET_NAME/release/${BIN_NAME}" "${BIN_DIR}"\n        # Also strip if possible\n        if [ -n "${STRIP}" ]; then\n          "${STRIP}" "${BIN_PATH}"\n        fi\n        # Let subsequent steps know where to find the (stripped) bin\n        echo ::set-output name=BIN_PATH::${BIN_PATH}\n        echo ::set-output name=BIN_NAME::${BIN_NAME}

And uploading to Github

    - name: Create tarball\n      id: package\n      shell: bash\n      run: |\n        PKG_suffix=".tar.gz" ; case ${{ matrix.job.target }} in *-pc-windows-*) PKG_suffix=".zip" ;; esac;\n        PKG_BASENAME=${PROJECT_NAME}-v${PROJECT_VERSION}-${{ matrix.job.target }}\n        PKG_NAME=${PKG_BASENAME}${PKG_suffix}\n        echo ::set-output name=PKG_NAME::${PKG_NAME}\n        PKG_STAGING="${{ env.CICD_INTERMEDIATES_DIR }}/package"\n        ARCHIVE_DIR="${PKG_STAGING}/${PKG_BASENAME}/"\n        mkdir -p "${ARCHIVE_DIR}"\n        mkdir -p "${ARCHIVE_DIR}/autocomplete"\n        # Binary\n        cp "${{ steps.strip.outputs.BIN_PATH }}" "$ARCHIVE_DIR"\n        # base compressed package\n        pushd "${PKG_STAGING}/" >/dev/null\n        case ${{ matrix.job.target }} in\n          *-pc-windows-*) 7z -y a "${PKG_NAME}" "${PKG_BASENAME}"/* | tail -2 ;;\n          *) tar czf "${PKG_NAME}" "${PKG_BASENAME}"/* ;;\n        esac;\n        popd >/dev/null\n        # Let subsequent steps know where to find the compressed package\n        echo ::set-output name=PKG_PATH::"${PKG_STAGING}/${PKG_NAME}"\n    - name: "Artifact upload: tarball"\n      uses: actions/upload-artifact@master\n      with:\n        name: ${{ steps.package.outputs.PKG_NAME }}\n        path: ${{ steps.package.outputs.PKG_PATH }}\n\n    - name: Check for release\n      id: is-release\n      shell: bash\n      run: |\n        unset IS_RELEASE ; if [[ $GITHUB_REF =~ ^refs/tags/v[0-9].* ]]; then IS_RELEASE='true' ; fi\n        echo ::set-output name=IS_RELEASE::${IS_RELEASE}\n    - name: Publish archives and packages\n      uses: softprops/action-gh-release@v1\n      if: steps.is-release.outputs.IS_RELEASE\n      with:\n        files: |\n          ${{ steps.package.outputs.PKG_PATH }}\n      env:\n        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

And after building this github actions file, we find that\u2026 3 targets\nfail to build.

x86_64-unknown-freebsd,\nx86_64-unknown-illumos,\npowerpc-unknown-linux-gnu.

Luckily, the error message that cross provides gives us a clear\nindication of what to fix. Cross does not provide a proper image, so it\ngets confused, defaults to the toolchain it\u2019s running on (ubuntu 20.04),\nand the linker cannot find the proper libraries required. Easy to fix:\nAdd a Cross.toml file to the root of the project with\ndocker images for the particular targets, and build again.

[target.x86_64-unknown-freebsd]\nimage = "svenstaro/cross-x86_64-unknown-freebsd"\n\n[target.powerpc64-unknown-linux-gnu]\nimage = "japaric/powerpc64-unknown-linux-gnu"

You\u2019ll notice that illumos is missing here \u2013 I couldn\u2019t find a\nsuitable docker image to build it on docker hub, so I gave up. If you\nfind one, let me know and i\u2019ll update this article.

Results

Out of the 29 architectures provided in Tier 1 and Tier 2 with host\ntools, it was easy enough to build a binary for 28 architectures (We\nonly need a solaris/illumos docker image to build for the last one).

That\u2019s pretty good, given that this only took a couple of hours to\ntest out. I hope Rust continues to support this many architectures into\nthe future, and Github Actions keeps being a good platform to make\nreleases for.

If you\u2019d like to take the repo for yourself to build rust binaries on\nreleases for 28 architectures, feel free to clone/fork the repo here: https://github.com/Takashiidobe/rust-build-binary-github-actions

Offline e-mail in the terminal

2021-10-27T09:44:59-05:00

\n\n

I like putting stuff in the terminal. Let\u2019s roll back the clock 30\nyears and go back to terminal e-mail.

Let\u2019s start with installing an e-mail client.

I like aerc. I also use mac OS,\nso it can be installed with a simple brew install aerc.

Setting up Aerc

Here\u2019s another guide for that: Text based\ngmail

I personally use a gmail, so I had to create a gmail app password to\nprovide to aerc.

On first startup, aerc has a startup wizard that helps you set up\nyour account. Nice! Put in your information and enjoy e-mail in the\nterminal.

My aerc/accounts.conf looks something like this:

[Personal]\nsource        = imap://me@gmail.com:password@imap.gmail.com:993\noutgoing      = smtp+plain://me@gmail.com:$APP_PASSWORD_HERE@smtp.gmail.com:587\nsmtp-starttls = yes\nfrom          = Me <me@gmail.com>\ncopy-to       = Sent

As well, I wanted to change up some of the defaults:

I set my aerc/aerc.conf like so: this sets my pager to\nbat instead of less -R, and prefers to display\nthe HTML portion of an email first if possible, then falling back to the\nplain text version.

To be able to read HTML e-mail, I uncommented the line for text/html,\nand use the html filter that\u2019s provided by aerc. This requires\nw3m and dante, so I brew installed both:

brew install w3m\nbrew install dante

[viewer]\npager=/usr/local/bin/bat\nalternatives=text/html,text/plain\n\n[filters]\nsubject,~^\\[PATCH=awk -f @SHAREDIR@/filters/hldiff\ntext/html=bash /usr/local/share/aerc/filters/html\ntext/*=awk -f /usr/local/share/aerc/filters/plaintext

Great! Now we\u2019re all set up with aerc.

Offline Support

This is great and all, but try to run aerc without\ninternet connection. It hangs. That\u2019s not acceptable! Let\u2019s fix\nthat.

Drew DeVault, the original author of aerc published a\nguide on making aerc work offline https://drewdevault.com/2021/05/17/aerc-with-mbsync-postfix.html.\nWe\u2019ll follow this guide a bit, but I use gmail instead of\nmigadu, and ended up using msmtp instead of\npostfix, so there\u2019ll be a few changes.

Mbsync for reading e-mail offline

Let\u2019s start off installing mbsync. On Mac OS it is\nlisted as its previous name, isync. So run\nbrew install isync to install it.

We\u2019ll then set it up \u2013 the config file is at\n~/.mbsyncrc, so create that and fill it with this:

IMAPStore gmail-remote\nHost imap.gmail.com\nAuthMechs LOGIN\nUser you@gmail.com\nPass $APP_PASSWORD_HERE\nSSLType IMAPS\n\nMaildirStore gmail-local\nPath ~/mail/gmail/\nInbox ~/mail/gmail/INBOX\nSubfolders Verbatim\n\nChannel gmail\nFar :gmail-remote:\nNear :gmail-local:\nExpunge Both\nPatterns * !"[Gmail]/All Mail" !"[Gmail]/Important" !"[Gmail]/Starred" !"[Gmail]/Bin"\nSyncState *

If you don\u2019t already have a ~/mail/gmail/INBOX folder,\ncreate it with mkdir -p ~/mail/gmail/INBOX.

Now, if you run mbsync gmail, all of your e-mail will be\nsynced to your ~/mail/gmail folder.

Now, we just need aerc to pull locally instead of from gmails\nservers.

Go back to aerc/accounts.conf, and edit the source under\nthe [Personal] tag to point to maildir://~/mail. This will\nlet aerc read your e-mail locally instead of from gmail\u2019s servers.

As well, set the default to gmail/INBOX to land in your\ninbox folder on start.

[Personal]\nsource        = maildir://~/mail\noutgoing      = smtp+plain://me@gmail.com:$APP_PASSWORD_HERE@smtp.gmail.com:587\ndefault       = gmail/INBOX\nsmtp-starttls = yes\nfrom          = Me <me@gmail.com>\ncopy-to       = Sent

Turn off your internet and run aerc. Now you can read\nyour e-mail offline! We\u2019ll want to always keep our mailbox in sync, so\nwe\u2019ll want to run mbsync frequently to keep our mailbox in\nsync.

First, we\u2019ll need a program called chronic, which is\nprovided in moreutils. Download it with\nbrew install moreutils.

Run crontab -e to edit your local crontab, and put this\nin it.

This will have cron execute mbsync gmail every minute,\nkeeping your mailbox in sync with google\u2019s servers.

MAILTO=""\nPATH=YOUR_PATH_HERE\n* * * * * chronic mbsync gmail

Sending E-mail offline

If you try to send e-mail while offline on aerc currently, the e-mail\nwill never send. What we\u2019d like is some queue where the e-mail is sent\nimmediately if we\u2019re online, otherwise, to save that message in a queue,\nand send out all messages immediately as we regain connectivity.

We\u2019ll use msmtp for that.

Install it with brew install msmtp.

msmtp\u2019s config file is called ~/.msmtprc. Fill that file\nwith this:

defaults\ntls on\n\naccount gmail\nauth on\nhost smtp.gmail.com\nport 587\nuser me\nfrom me@gmail.com\npassword APP_PASSWORD_HERE\n\naccount default: gmail

Now we can send e-mail from the command line. This isn\u2019t super useful\nyet, since aerc has this functionality already. Next, we need to\nimplement the queueing capability we discussed. You\u2019ll want to download\ntwo bash scripts that do this for us: msmtpq and\nmsmtp-queue. These can be found here: https://github.com/tpn/msmtp/tree/master/scripts/msmtpq.\nMake them executable and place them somewhere on your path (I chose\n/usr/local/bin). This implements the queueing that be\ndiscussed.

Finally, we\u2019ll have to hook up aerc to use this\ncapability in accounts.conf.

[Personal]\nsource        = maildir://~/mail\noutgoing      = /usr/local/bin/msmtpq\ndefault       = gmail/INBOX\nsmtp-starttls = yes\nfrom          = Me <me@gmail.com>\ncopy-to       = Sent

Finally, we\u2019ll want to be able to execute the queueing functionality\nof msmtpq every minute as well. Edit your crontab to look\nlike this:

MAILTO=""\nPATH=YOUR_PATH_HERE\n* * * * * chronic mbsync gmail\n* * * * * chronic msmtp-queue -r

And with that, we\u2019re done! We can now read e-mail offline, which\nsyncs every minute when online, and send e-mail offline, which will get\nqueued, and sent as soon as we\u2019re back online again.

Work Offline

2021-10-07T20:43:45-05:00

\n\n

In my last post, I discussed the tradeoffs of various languages with\nregards to software longevity \u2013 I wanted to pick a language to use that\nwould make long lasting software.

In this post, I want to discuss working offline \u2013 why that\ninterleaves with the choice of language to use, and how to use tooling\nto make that accessible.

Joe Nelson, of PostgREST fame, wrote a series of posts that resonated\nwith me \u2013 starting with \u201cGoing\n\u2018Write Only\u2019\u201d where he starts off with quoting Joey Hess, a person\nwho \u201cLives in a cabin and programs Haskell on a drastically\nunder-powered netbook\u201d, where he harvests all of his electricity from\nthe sun, and works using a distributed workflow, much like a git\nworkflow.

He then goes on to explain his motivation of going \u201cwrite-only\u201d\nthusly:

\n
These people\u2019s thoughts are not idle for me. They contain a reproach,\na warning that one can be very busy and yet do unproductive things,\nhamartia. I want to focus on doing the right thing. Actually focus is\nthe wrong word. Focusing my thoughts would imply the same thoughts but\nsharper, whereas I want to change the way I think.
\n

He then went on to publish more blog posts focused on creating\nsoftware that lasts:

Good\nbooks for deep hacks
Inside\nthe C Standard Library
Tips\nfor stable and portable Software
Dynamic\nlinking best practices

Which are all great reads, and inspired me to write this post about\nhow I work offline, and what I use to make that happen.

Tools of the Trade

Git

It\u2019s a no-brainer to use git for this kind of workflow. You can go\noffline for weeks at a time, hacking away at your branch, and when\nyou\u2019re back online, merge back to the main branch, fetch what you\nmissed, and go back to hacking away offline. Since you have a copy of\nall the history of the repo on your hard disk, if you need to look at\nchanges from the past, you can do just that. Git enables this workflow,\nwhile other centralized systems require constant connectivity.

Man Pages

Man pages (and info pages) predate the internet, so of course they\nwork well offline. Unfortunately for Mac, Man pages are pretty sparse\n(basically scavenged off of old BSD manuals), and they aren\u2019t always the\nmost descriptive when it comes to using cli tools, so I tend to use\ntldr for that case.

To pay it forward, I tend to bundle man pages with utilities that I\nmake, like rvim or simplestats on cargo, even\nthough rust docs are more common in rust land.

Tldr

Ever remember how to use tar? Me neither. Tldr is a man\npages complement, offering examples for cli applications just by typing\ntldr $keyword. I personally really like it, because it\u2019s\nlike having a concise stackoverflow at your fingertips.

ZIM files

When I tell people about working offline, they ask \u201cbut what about X\nwebsite?\u201d. Well, if you want to look up a question on wikipedia or\nstackoverflow, you\u2019d surely need online access, right?

That\u2019s where ZIM files come in \u2013 offline archives of whole websites.\nThe kiwix project (sponsored by wikipedia) offers downloads and torrents\nfor ZIM files for sites like wikipedia and stackoverflow, which you can\nwholesale download over an internet connection, and then search through\nto your hearts content. So if you ever forget how to reverse a linked\nlist in your favorite language, you can search through stackoverflow to\nfind out.

You can also download wikipedia\u2019s own ZIM file archiver and archive\nother sites that you like.

DevDocs

Not all languages/projects have offline documentation, but most of\nthem have documentation on the web. DevDocs is a project that allows you\nto download and search through that documentation in a convenient way\noffline. Every time you get back online, you can sync the documentation\nof the projects you like to follow. Nifty.

E-books

E-Books are really great too, in both PDF and epub format. You can\nkeep a copy of them on your hard disk and search through them too\nwithout going to the internet.

Papers

Arxiv is a repository of open access\narticles in the sciences and maths. You can download papers off there\nfor free, and read them at your leisure. There\u2019s a treasure trove of\npapers to read!

Rustup + Cargo

Rust has a strong focus on offline work \u2013 cargo allows you to turn\ndocstring comments into HTML documentation with search, and\nrustup comes with its own offline documentation, by virtue\nof rustup docs.

Cargo also allows one to force offline mode by adding the\n--offline command line flag \u2013 this forces cargo to use\ndownloaded crates instead of going to crates.io.

Hardware

Currently, I work off of a Macbook pro 2017, which only has 128GB of\nhard disk space. If I want to supplant that with extra hard disk space,\nI have an external SSD that I carry around with me (with ZIM files for\noffline use). That being said, it is getting a bit old in 2022, and I\nmay replace it in the coming years for a framework laptop, as the company is very\ndevoted to right to repair.

Software that lasts Offline

2021-10-01T23:05:13-05:00

\n\n

It seems like every day software gets outdated. It\u2019s so hard to build\nsoftware that lasts for 3 years, let alone 30. Yet the houses we live in\nhave seen much longer lives, with just a bit of refurbishing here and\nthere. Why can\u2019t our software be the same way? _why the lucky stiff said\nsomething that resounds with me as a programmer as he left the\ninternet.

\n
To Program anymore was pointless.
\n
My programs would never live as long as the trial.
\n
A computer will never live as long as the trial.
\n
What if Amerika was only written for 32-bit power pc?
\n
Can an unfinished program be reconstructed??
\n
Can I write a program and go, \u201cAh, Well, You get the gist of it.\u201d
\n

Our software doesn\u2019t last as long as the written word. It\u2019s a very\ndefeating thing to think that most of our code doesn\u2019t last that long. I\nwondered if it had to be this way, or if it was something to do with how\nwe approached software writing. C code has lasted for a few decades, and\ncould last a few more \u2013 there\u2019s lots of COBOL and FORTRAN code in the\nwild that\u2019s 50 years old. Software that lasts is both high-level and\nlow-level at the same time \u2013 assembly would never last because it\u2019s tied\nto its platform.

A higher-level language need not be tied to any specific\narchitecture. Yet the need for compatibility with architectures, past,\npresent, and future makes it so the language must only build off of\nlow-level primitives. A contradiction.

I want a language that has offline documentation, that is robust, has\nwide compatibility in the past, present, and future, has standards, has\nmultiple implementations, is fast, and is easy to develop for. Here\u2019s a\nshort list of the languages I looked at, with some pros and cons for\neach in making software that lasts.

Javascript

Javascript is well specified (by committee), with multiple\nimplementations, and a strong commitment to backwards compatibility \u2013\nbut that\u2019s about where the pros end. Even though I coded professionally\nin Javascript for a few years, things about the language still trip me\nup \u2013 I sometimes forget to check for nulls, or I get different types\nthan what I\u2019m expecting when I use the standard library. As well, I know\nthose things will never be fixed, because the web is the most popular\nfor programming. Therefore, fundamentally, the language will never be\nable to smooth out its warts.

Typescript

Typescript smooths out most of the usability issues of\nJavascript, and gives it static typing, generics, and new constructs\n(enums, interfaces, types). It compiles to Javascript quickly, and\nconsidering how accessible Javascript is, it shares most of its\naccessibility pros. That being said, Typescript has more lax backwards\ncompatibility requirements, but with its compatibility to Javascript,\nwon\u2019t be able to fix the language\u2019s warts.

WASM

Web Assembly is a new contender for web language of the future\nTM. It\u2019s a minimalistic language with S-expressions (like\nlisps) and is meant to be an easy compiler target. Go, C, C++, Rust, and\nothers can compile down to it, targeting the WASM capabilities of the\nbrowser. As well, WASI seems like a portable way to run sandboxed\napplications in the future.

It\u2019s too low-level for productive use, but is an interesting foray\ninto fixing the kludge of the web.

Ruby

Ruby is the most OOP language I can think of \u2013 message passing,\neverything is an object, and GC pauses ad nauseum. It\u2019s a language with\na lot of expressiveness, and a lot of elegance. It has strong C\nbindings, so it has good interop with system libraries \u2013 and pretty good\nbackwards compatibility.

That being said, it\u2019s slow and clunky to write. The philosophy of\nexpressiveness means that everybody writes ruby code differently, and\nthere\u2019s a huge divide between ruby programmers, who are more restrictive\nwith what functionality they use, and rails programmers, who are more\nkeen on monkey-patching everything they can find for usability reasons.\nNot to say one side is right, but the language\u2019s stewardship has been on\nappeasing many camps, and that leads to fragmentation.

Python

Python is also very OOP, but contrary to ruby, it even comes with its\nown Zen, which you can read by entering import this at the\nREPL.

Beautiful is better than ugly. Explicit is better than implicit.\nSimple is better than complex. Complex is better than complicated. Flat\nis better than nested. Sparse is better than dense. Readability counts.\nThere should be one\u2013 and preferably only one \u2013obvious way to do it.

Python prefers fewer ways to do one thing, but that zen has been\nwearing off, with a huge standard library, which makes it hard to commit\nto backwards compatibility. Python has some system dependencies that are\nless than stellar when it comes to backwards compatibility, and the\nlarge surface area can make it hard for the language to keep stable.

Oh yeah, and remember Python3?

OCaml

OCaml is an interesting language; it has a bytecode interpreter,\ncross compilation, and compilation to native, just like Haskell. It\u2019s\nrelatively fast to compile, but has issues with backwards compatibility\nand footguns. As well, the standard library has been reimplemented by\nmany, including by Jane Street, twice (Base and Core).

It\u2019s a clear language with some baggage (Few people use the \u201cObject\nOriented\u201d or \u201cO\u201d features of \u201cOCaml\u201d). For loops and classes are often\nstruck down in code review as anti-patterns. Best to be functional, all\nthe time.

It is relatively fast, with a good ecosystem (Dune makes building\nOCaml apps pretty nice in 2021) but it\u2019s still a relatively small\necosystem, fighting against Haskell to become Typed Functional\nProgramming\u2019s main language.

Offline Documentation is great, and the standard library has few\nwants of its environment, but it lacks multicore support \u2013 in an\nincreasingly parallel world, that\u2019s a deal-breaker. It\u2019s looking like a\nreimplementation of the standard library with async might require a\nmajor version bump, to (5.X).

JVM languages (Java, Scala, Clojure)

JVM languages are pretty strong, with the JVM allowing users to\ntarget many platforms with their code (since the JVM runs on many\nthings). That being said, the JVM offers some penalty, because the\nruntimes can be quite hefty and difficult to port. It\u2019s not quite as\neasy as sending someone a binary and they can run it, or easy to\ncontainerize JVM apps, unlike those that offer native binaries.

CLR languages (C#, F#)

Same as the JVM languages, although I have to say that F# and C# are\npretty fun to program in.

Go

Go was the first serious language in the list I considered\nlearning \u2013 simple like C, with a strong standard library for the modern\nera with a focus on async + web programming. Sounds like a dream. Oh,\nand fast compile times and cross-compilation. Woah. Relatively small\nnative binaries that don\u2019t rely on libc? Doable in Go.

Lots of big projects have been done in go, like most hashicorp stuff,\ndocker, kubernetes, and a wealth of devops/cloud tools. It\u2019s a\nproductive language, and one that nudges you to sane defaults.

But it\u2019s not all sunshine and roses \u2013 the package managing story has\nbeen a nightmare, there are no generics (Hello casting to Interface{})\nand I don\u2019t understand why a GC\u2019ed language should have pointers and\nreferences explicitly? You get a lot for free, just by using Go, but you\npay for it with complexity \u2013 I rarely miss generics as an application\ndeveloper, until I\u2019m slapped with complexity because the library\ndeveloper decided to hand over the complexity to me. Go also has one\nstandard implementation and stewardship led by Google, which makes it a\nbit odd \u2013 Google has never been one for backwards compatibility, and it\nseems like Go might be due for a Go 2.0, which could be an ecosystem and\nbinary break for Go. Only time will tell if Go can survive the blow, or\nif it\u2019ll go with the route of holding onto the choices of the past.

C++

C++ is the first language on this list with no Garbage Collection, it\nhas a specification that\u2019s ISO standardized, with many committees, and\nwith many implementations. It has functionality for OOP, Functional\nProgramming, Generics, async, with the promise of being as fast as C\nwhile staying easier to use.

It mostly capitalizes on that promise. With the advent of modern C++,\neven though C++ has added many features, it has had a strong promise\ntowards backwards compatibility (it keeps ABI compatibility for a long\ntime, only recently breaking ABI in C++11), and only removing clearly\nbroken functionality (auto_ptr, anyone?) But it can be hard to use \u2013\n(the iterator API is one frustrating example), and it can be hard to see\nthe runtime cost of the abstractions you use \u2013 Even though C++ follows\nthe \u201cZero-Cost Abstractions\u201d principle, where you don\u2019t pay for what you\ndon\u2019t use, and what you do use you couldn\u2019t hand-code any better, it\nbreaks down somewhat \u2013 std::map is extremely slow on\ncertain workloads, because it\u2019s just implemented incorrectly \u2013 and\nIterators are a good example of an API footgun (remember to always check\nfor .end()!).

The complexity is never really ever paid off \u2013 you have to litter\nyour code with extra keywords like const to the left and right of your\nfunctions, along with noexcept, and final, and override. You have to\nremember what is and what isn\u2019t virtual, and use keywords accordingly,\nand you have to always generate move constructors, copy constructors,\nand remember which one is which \u2013 why are there so many ways to\ninitialize an object, and why are there so many things to remember when\nyou write your own class?

Oh, and what\u2019s the difference between struct and class? Who\nknows?

C++ is a language with lots of promises but it has run into the\nlimits of its promises \u2013 backwards compatibility, ease of use,\nperformance, and expressiveness are all in tension, and C++ is the\nlanguage you can see that in the most.

C

Meanwhile, C is much more minimalistic than C++. You get nothing \u2013 no\nexpanding arrays, no hashmaps, no trees, no graphs, no async, no\nunicode, nothing.

It\u2019s a very bare language. That helps it with portability (C is the\nmost portable language on this list by far) but it pays that price by\ndoing almost nothing for you.

It\u2019s a high-level language that makes few choices for you and leaves\nyou in a sandbox of your own creation. That can make code-sharing hard,\nsince it\u2019s bound to its environment \u2013 if you want to make a cross\nplatform library, you have to be careful about which libraries you use,\nsince different OSes have different system libraries.

It has a static type system and is speedy, for sure \u2013 but it can be\nclunky and unsafe as well.

Rust Every Day (For 3 years)

With all that being said, I\u2019ve decided to pick Rust as my language of\nchoice for the next 3 years for as many programming related tasks as I\ncan. Rust is pleasant to develop for, has pledged backwards\ncompatibility since 2015, targets a wide variety of architectures\n(thanks mainly to LLVM) and I\u2019m convinced is a language for the rest of\nmy career; it has a great team working on it, with a unique governance\nstructure that makes it resilient to being steered by one interest\ngroup.

It\u2019s taken some great ideas from functional programming (Tagged\nUnions, Sum Types, iterators) while keeping the runtime promises of more\nimperative languages. It\u2019s a great language to learn for the future, and\none that I\u2019m sure will keep on growing, and for that, I\u2019m throwing my\nweight behind it.

Rust every day. For 3 years. Then I\u2019ll revisit this and see what\u2019s\nchanged, but I\u2019d like to use Rust for the next 10 years, at least.

Implementing Iterators

2021-09-18T21:41:43-05:00

\n\n

Let\u2019s talk about implementing iterators: a way to visit every item in\na collection. We\u2019ll use C as an implementation language because it\u2019s\nsimpler than other languages, and we\u2019ll implement C++\u2019s iterator API.\nThis is the same in most mainstream programming languages, like Rust,\nC++, Python, Ruby, JavaScript, Java, C#, and PHP, with a few small\nimplementation differences.

The API

The API we\u2019ll create is simple. A int* next(int* it)\nfunction that takes an iterator and returns its next element, or a\nNULL pointer if nothing comes next, and a\nbool has_next(int* it) that returns true if it\nhas a next item, or false if it does not.

The C++ iterator API needs a few functions that give you an iterator\nto a collection. These are called begin() and\nend(), which return a pointer to the first item in the\ncollection, and one past the end of the collection. This is a dangerous\nAPI, since if we dereference end() we automatically cause\nUndefined Behavior, but our APIs become a bit cleaner. Tradeoffs, I\nguess.

We\u2019ll elide the details of begin in our example and implement it\nourselves.

Let\u2019s start by defining our collection: an array of ints from 1 -\n5.

int items[] = {1, 2, 3, 4, 5};

Let\u2019s say we want to print them: no need for iterators, of\ncourse.

#include <stdio.h>\n\nint items[] = {1, 2, 3, 4, 5};\n\nint main(void) {\n    for (int i = 0; i < 5; i++) \n        printf("%d ", items[i]);  \n}

But we have to initialize and increment a variable and use it as an\nindex to our collection\u2026 We want a clearer way of expressing a loop\nthrough all the items in a collection, and that\u2019s where iterators come\nin.

Let\u2019s start by defining the begin and end iterators.

int* begin = &items[0];\nint* end = &items[5];

Remember, the begin points to the first item of the collection, and\nend points to one past the end. We can\u2019t dereference end, so keep that\nin mind.

Now, to create the next() function, we want to take an\niterator and move to the next item if we aren\u2019t already at the end\niterator.

Let\u2019s do that:

int* next(int* it) {\n if (it != end) \n return it + sizeof(int);\n return NULL;\n}

Since we know that our iterator is an (int) pointer, we want to\nincrement the iterator four bytes (the result of sizeof(int) on my\ncomputer). This works, but there\u2019s a shorthand that most C compilers\nwill let you do, called pointer arithmetic. In this case, the compiler\nknows that this is an int pointer, and so it\u2019s overloaded additions and\nsubtractions to move forward and backwards by the sizeof an int.

We can rewrite the above as:

int* next(int* it) {\n if (it != end) \n return ++it;\n return NULL;\n}

Next, we want to write has_next. has_next\nshould return a bool true if the iterator can\nbe incremented, or false if not. We know that an iterator\nhas a next item if it\u2019s not in the last item in the collection, which is\njust before the end pointer. Thus, we can define has_next\nthusly:

int has_next(int* it) {\n return it != end - 1;\n}

Let\u2019s use our iterators thus far to traverse our collection:

#include <stdio.h>\n\nint items[] = {1, 2, 3, 4, 5};\n\nint* begin = &items[0];\nint* end = &items[5];\n\nint* next(int* it) {\n    if (it != end) \n        return ++it;\n    return NULL;\n}\n\nint has_next(int* it) {\n    return it != end - 1;\n}\n\nint main(void) {\n    puts("Printing forwards");\n    int* it = begin;\n    while (it != end) {\n        printf("%d has next? ", *it);\n        puts(has_next(it) ? "true" : "false");\n        it = next(it);\n    }\n}

This should print out:

Printing forwards\n1 has next? true\n2 has next? true\n3 has next? true\n4 has next? true\n5 has next? false

Why use Iterators?

If this seems like a lot of ceremony for iterating through an array,\nit is. It\u2019s totally unnecessary. It gives us nothing more powerful than\nwhat a raw for loop would give us. But what happens if our collection\nisn\u2019t linear? What happens if we traverse a sorted map, or a graph?

With a for loop, we must ask the caller to understand how the data\nstructure is implemented. With an iterator, we can provide a definition\nof next, and has next, and the user can call it without knowing\nanything about the underlying collection outside of the\nfact that it is iterable.

This allows us to wrap graphs, trees, hash tables, ranges (finite and\ninfinite), and circular data structures in a friendly API for our\nusers.

As well, language features allow us to reward usage of iterators by\nmaking syntax more terse: In C++, Rust, Java, C#, Ruby, Python, and\nJavaScript, if you implement the iterable API in each language, you can\ndo something along these lines:

for (item in collection)\n  do something to item

And the language takes care of the rest. In C, we can\u2019t do that, but\nin other languages, the language gives us some reward for doing so for\nour own types, as our types get to behave like library defined\ntypes.

Next Steps

Now that we can implement iterators in C, try giving it a shot in\nyour favorite language and seeing what the iterator protocol is for it.\nIt\u2019s loads of fun, I swear.

I tried it myself in C when writing a resizable array type too:

typedef struct Vector {\n  size_t len;\n  size_t capacity;\n  int *items;\n} Vector;\n\n// Allow the user to set their own alloc/free\nstatic void *(*__vector_malloc)(size_t) = malloc;\nstatic void *(*__vector_realloc)(void *, size_t) = realloc;\nstatic void (*__vector_free)(void *) = free;\n\nvoid vector_set_alloc(void *(malloc)(size_t), void *(realloc)(void *, size_t),\n                      void (*free)(void *)) {\n  __vector_malloc = malloc;\n  __vector_realloc = realloc;\n  __vector_free = free;\n}\n\nVector *vector_new(const size_t len, ...) {\n  Vector *v = __vector_malloc(sizeof(Vector));\n  int capacity = 8;\n  capacity = max(pow(2, ceil(log(len) / log(2))), capacity);\n\n  v->items = __vector_malloc(sizeof(int) * capacity);\n  v->len = len;\n  v->capacity = capacity;\n\n  if (len > 0) {\n    va_list argp;\n    va_start(argp, len);\n\n    for (size_t i = 0; i < len; i++) {\n      v->items[i] = va_arg(argp, int);\n    }\n\n    va_end(argp);\n  }\n\n  return v;\n}\n\nvoid vector_free(Vector *v) {\n  __vector_free(v->items);\n  __vector_free(v);\n}\n\nint vector_get(Vector *v, size_t index) {\n  assert(index >= 0 && index < v->len);\n  return v->items[index];\n}\n\nvoid vector_set(Vector *v, size_t index, int val) {\n  assert(index >= 0 && index < v->len);\n  v->items[index] = val;\n}\n\nint vector_empty(Vector *v) { return v->len == 0; }\n\nvoid vector_push(Vector *v, int val) {\n  if (v->len == v->capacity) {\n    v->capacity *= 2;\n    v->items = __vector_realloc(v->items, sizeof(int) * v->capacity);\n  }\n  v->items[v->len] = val;\n  v->len++;\n}\n\nint *vector_begin(Vector *v) { return &v->items[0]; }\n\nint *vector_end(Vector *v) { return &v->items[v->len]; }\n\nint *vector_next(Vector *v, int *it) {\n  if (it != vector_end(v))\n    return ++it;\n  return NULL;\n}\n\nvoid vector_for_each(Vector *v, int (*fn)(int)) {\n  for (int i = 0; i < v->len; i++) {\n    v->items[i] = (*fn)(v->items[i]);\n  }\n}\n\nint vector_pop(Vector *v) {\n  assert(v->len > 0);\n  int top = v->items[v->len - 1];\n  v->len--;\n  return top;\n}

The Expression Problem And Operations On\nMatricies

2021-06-22T08:52:01-04:00

\n\n

I\u2019ve heard the sentiment that technical interviews focus on the wrong\nthings; technical aptitude in data structures and algorithms isn\u2019t a\ngreat measure of people\u2019s on the job performance. I agree. That being\nsaid, there are some small problems where cursory knowledge in them can\nhelp you out.

I\u2019m going to use an example I encountered recently, where I wanted to\naggregate my personal finances into monthly, quarterly, and yearly\nreports, with columns for earnings, spend, and cashflow (earnings -\nspend).

I downloaded the relevant CSVs and went to work parsing them.

Parsing

The first problem came with cleaning up some transactions that were\nunnecessary \u2013 banks tend to charge a maintenance fee, but they end up\ncrediting you if you meet certain criteria. Even though this balances\nout, I didn\u2019t want this to count in my earnings and spend, so I wrote a\nsed regex to delete these lines. Likewise, I wanted to\nremove some of my investments that I had made (I don\u2019t consider these to\nbe spending, and I wanted to track these another way). Another\nsed regex it is. Eventually this became a pain, so I made a\nbash function to combine the regexes in an array to parse the CSV. You\njust add regexes and it\u2019ll remove the related transactions. Easy\nenough.

The problem

If you visualize the problem at hand, you\u2019ll get this matrix.

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

	Monthly	Quarterly	Yearly
Earnings	Monthly Earnings	Quarterly Earnings	Yearly Earnings
Spend	Monthly Spend	Quarterly Spend	Yearly Spend
Cashflow	Monthly Cashflow	Quarterly Cashflow	Yearly Cashflow

We have an (m * n) problem, where if we had a new row or column, we\nadd (m) or (n) more things we need to calculate.

Let\u2019s get solving.

Naive Approach

The Naive approach is the O(m * n) solution, where you create a\nfunction that deals with a particular cell of this matrix. Take Monthly\nEarnings. You would write a function that does the logic for dividing\nthe CSV into months, and then applying the logic for Earnings to it.

You would repeat this eight times, to get 9 different functions.

When calling this logic, you would have a switch-case that would\nselect the logic required.

This isn\u2019t very dry, and very tedious, even with only 9 cells. I\nwanted something better. This is due to the runtime of matricies.

The Expression problem

The expression\nproblem is a fundamental problem in writing functions that work on\ntypes and vice versa.

OOP Languages like Java make it easy to create new types that have\noperations on data. But let\u2019s say you want to add a new method to all\nyour classes. You now need to add a new method to all of your existing\nand future classes.

ML type languages use pattern matching, which allows you to easily\nadd new operations to a function. But to apply that operation to all\nexisting types, you have to add a new case to all of the pattern\nmatches.

Let\u2019s say I add a new time period, \u201cbi-yearly\u201d to denote half a year\nchunks. Well, if we go by the naive case, we\u2019d have to add new cases for\nbi-yearly + cashflow, bi-yearly + earnings, bi-yearly + spend. That\u2019s 3\nnew functions for one new time period. Ouch.

Let\u2019s say I want a new category that only counts purchases that are\nlarger than $100, and call these \u201clarge purchases\u201d. If so, I would have\nto add logic to count for monthly, quarterly, bi-yearly, and yearly time\nperiods. That\u2019s 4 new functions for a new category.

Let\u2019s say I want to add a new dimension. I want to split my purchases\nfor all of the above categories and time periods between my credit card\nand debit card. That\u2019s 4 * 4, or 16 new functions I\u2019d need to\nimplement.

Big O notation would say this grows linearly with regards to the size\nof each dimension.

If we have 4 monetary categories and 4 time periods, we have 4 * 4 or\n16 functions to implement. If we have 4 monetary categories, 4 time\nperiods, and 2 types of credit cards, we have 4 * 4 * 2, or 32 functions\nto implement.

Our first proposed matrix is a 2D square, and we\u2019re calculating its\narea. A square with a length of four and a height of 4 has an area of\n16.

Our second proposed matrix is a 3D square (a cube). To calculate the\narea of a cube, you multiply its length, width, and height. We have a\nlength of 4, a width of 4, and a height of 2, totalling 32.

As we add new fields to our rows and columns, and new dimensions to\nour matrix, we\u2019ll soon see that this becomes untenable. (What happens if\nI also want to count my investment accounts, of which I have many? I\u2019d\nhave to add more and more dimensions, and the amount of functions to\nimplement increases quite a lot.

Improving the Naive Approach

In the interest of clean code, I wanted to create small composable\nfunctions that would calculate the each category.

The Earnings function would only calculate transactions with a\npositive amount The Spend function would only calculate transactions\nwith a negative amount The Cashflow function would calculate all\ntransactions.

The Monthly function would divide the CSV into months, and apply a\nfunction to each range. The Quarterly function would divide the CSV into\nquarters, and apply a function to each range. The Yearly function would\ndivide the CSV into years, and apply a function to each range.

But the problem is how to set this up properly.

We need some way to signal to the main function that we want to\ncalculate a row * column pairing (a cell). If we add a new dimension, we\ndon\u2019t want to break previous code.

Using a pair

One way to signal this is to use a pair of enums that are captured in\na pair as (row, col). This works well enough if we stick to two\ndimensions. If we add a 3rd dimension, though, this will create\nincorrect code. If we\u2019re strict on requiring a pair, then we can\u2019t add\nthe 3rd dimension at all without breaking all of our existing code. If\nwe\u2019re looser (allow any tuple, and unpack the first, second and third\nvalues) this will work, but our code will be a bit confusing in order to\nmaintain backwards compatibility (some code will check the first and\nsecond fields of a 3-tuple, even though it should be checking all\nthree).

Using a flag

Another way to signal this is to use a Flag enum. A flag enum is an\nenum that has values corresponding to powers of 2.

For example:

typedef enum Color {\n RED = 1,\n GREEN = 2,\n BLUE = 4\n} Color;

(Some people prefer to write it like this, to be explicit that this\nis a flag):

typedef enum Color {\n RED = 1 << 0, // 1 \n GREEN = 1 << 1, // 2\n BLUE = 1 << 2, // 4\n} Color;

This has the nice property that we can use bitwise or to denote more\nthan one state, and bitwise and to check if the enum is one or more or a\nparticular state.

Color color = RED | GREEN; // this color is Red and Green\nColor Black = RED | GREEN | BLUE; // this color, black, is all the colors\n\nif (color & RED) {\n  // this color has red, do some logic in the red case\n}\nif (color & GREEN) {\n  // this color has green, do some logic in the green case\n}\nif (color & BLUE) {\n  // this color has blue, do some logic in the blue case\n}

In C, since enums are stored as an unsigned int, you can store up to\n32 fields (rows + columns + dimensions, in our case). Sometimes this\nisn\u2019t enough, and you\u2019ll have to find another way, but in our case, it\nworks fine.

The final implementation

Finally to solve the problem, we want to provide a flag enum, and get\nthe CSV that we\u2019re applying it to. First, we want to slice the CSV into\nthe range provided, and then apply some category (Earning, Spend,\nCashflow) to it, and then save the CSV.

That can be done something like this:

typedef enum Categories {\n  MONTHLY = 1 << 0,\n  QUARTERLY = 1 << 1,\n  YEARLY = 1 << 2,\n  EARNINGS = 1 << 3,\n  SPEND = 1 << 4,\n  CASHFLOW = 1 << 5,\n} Categories;\n\nvoid generateCsvs(Category category) {\n  if (category & MONTHLY) {\n    doMonthlyLogic();\n  }\n  if (category & QUARTERLY) {\n    doQuarterlyLogic();\n  }\n  if (category & YEARLY) {\n    doYearlyLogic();\n  }\n  if (category & EARNINGS) {\n    doEarningsLogic();\n  }\n  if (category & SPEND) {\n    doSpendLogic();\n  }\n  if (category & CASHFLOW) {\n    doCashflowLogic();\n  }\n}

We can generate our CSVs just like that. Nice. If we add a new\ndimension, like Credit card vs debit card, all we have to do is add it\nto our enum and our main function. This only adds two new enums and two\nnew cases. We\u2019ve gone from adding (m * n) functions for our logic to\njust (m + n). Big O strikes again.

typedef enum Categories {\n  MONTHLY = 1 << 0,\n  QUARTERLY = 1 << 1,\n  YEARLY = 1 << 2,\n  EARNINGS = 1 << 3,\n  SPEND = 1 << 4,\n  CASHFLOW = 1 << 5,\n  CREDIT = 1 << 6,\n  DEBIT = 1 << 7,\n} Categories;\n\nvoid generateCsvs(Categories category) {\n  if (category & CREDIT) {\n    doCreditLogic();\n  }\n  if (category & DEBIT) {\n    doDebitLogic();\n  }\n  if (category & MONTHLY) {\n    doMonthlyLogic();\n  }\n  if (category & QUARTERLY) {\n    doQuarterlyLogic();\n  }\n  if (category & YEARLY) {\n    doYearlyLogic();\n  }\n  if (category & EARNINGS) {\n    doEarningsLogic();\n  }\n  if (category & SPEND) {\n    doSpendLogic();\n  }\n  if (category & CASHFLOW) {\n    doCashflowLogic();\n  }\n}

If we wanted more than 31 categories, we could still do that using a\nstruct instead of an enum. This creates a struct where every member is a\nboolean flag, and we\u2019re checking if it\u2019s set in our generateCsvs\ncode.

typedef struct Categories {\n  int MONTHLY : 1;\n  int QUARTERLY : 1; \n  int YEARLY : 1; \n  int EARNINGS : 1;\n  int SPEND : 1; \n  int CASHFLOW : 1; \n  int CREDIT : 1; \n  int DEBIT : 1; \n} Categories;\n\nCategories category = { 1, 0, 0, 1 }; // MONTHLY and EARNINGS are set, everything else is zero-initialized.\n\n// or this:\nCategories category = {};\ncategory.MONTHLY = 1; // set MONTHLY;\ncategory.EARNINGS = 1; // set EARNINGS; \n\nvoid generateCsvs(Categories category) {\n  if (category.MONTHLY) {\n    doMonthlyLogic();\n  }\n  // etc.\n}

A note on Associativity

But wait, there\u2019s something we can improve upon in our solution:

You might\u2019ve noticed that we\u2019re coupling our code based on time.\nSince we\u2019ve decided to cut up the CSV first by time and then calculate\nthe monetary category, you\u2019ve noticed that if we flip the order of them,\nwe might get a different result. This is bad, because refactoring tends\nto reorder things, and code that is coupled throughout time tends to\nlead to messier code.

To improve this, we need to add a few restrictions.

But first, a review on associativity and composition.

Associativity means that the order a function is applied in doesn\u2019t\nmatter.

Let\u2019s take the multiplication function. You\u2019ll notice that we can\napply them in any order and the function is still correct.

\n
4 * 3 * 2 == (4 * 3) * 2 == 4 * (3 * 2)
\n

Whereas division is not associative, because:

\n
(12 / 2) / 3 != 12 / (2 / 3).
\n

What we did above was like division, where we must apply the\nfunctions in some order, so they are coupled in time (the parentheses\ndenote this). What we really want is a mulitiplicative (associative)\nfunction, because no matter how many changes we make to the code, it\nwill only grow in complexity linearly, not polynomially.

Thus, if we guarantee that our operations are associative, then we\ndon\u2019t have to worry about how we lay out our main function at all.

To do this, we\u2019ll have to write our functions in a way that they take\na CSV and return a CSV after doing some to work them. These CSVs must\nwork on every other CSV that any other step in the main function can\nproduce. So, we\u2019ll change our main function to be like this, where every\nfunction takes the CSV and returns a CSV.

We\u2019ll use the flag enum to make sure that we\u2019re applying just the\nfunctions that we want.

void generateCsvs(Categories category) {\n  csv = {}; \n  // assume the CSV is an array\n  if (category & CREDIT) {\n    csv = doCreditLogic(csv);\n  }\n  if (category & DEBIT) {\n    csv = doDebitLogic(csv);\n  }\n  // etc\n  writeToCsv(csv);\n}\n// this is the same function \nvoid generateCsvs(Categories category) {\n  csv = {}; \n  // We've flipped the order, but it still works \n  if (category & DEBIT) {\n    csv = doDebitLogic(csv);\n  }\n  if (category & CREDIT) {\n    csv = doCreditLogic(csv);\n  }\n  // etc\n  writeToCsv(csv);\n}

Conclusion

We\u2019ve seen how we can use flag enums, combined with some logic, to\ncut down the amount of functions we have to write in order to calculate\nthe cell of a matrix. While I won\u2019t agree big tech interviews are the\nbest way to assess candidates, sometimes these problems crop up, and\npeople have been grappling with them for a long time (like the\nexpression problem).

Write RFCs

2021-06-03T21:00:10-04:00

\n\n

RFCs are Requests for Comments, popularized by the\nInternet Engineering Task Force (IETF) which develops and promotes\nstandards for the internet.

Defining an RFC

Much has been said about RFCs (Including RFC 3 which\noutlines how to write an RFC for the IETF), but let\u2019s read through the\nmain points of RFC 3.

RFCs can be on thoughts, suggestions, etc. relating to the subject\n(The internet in this case)
RFCs should be timely, rather than polished
RFCs do not require examples
RFCs can be as short or as long as needed

According to RFC 3, RFCs should have the following information:

\u201cNetwork Working Group Request for Comments:\u201d X (where X is the\nnumber of the RFC)
Author and Affiliation
Date
Title (Does not need to be unique)

Benefits of RFCs

According to 6\nLessons I learned while implementing technical RFCs as a decision making\ntool, After implementing RFCs in his organization, Juan Pablo\nBuritic\u00e1 picked RFCs for the following reasons:

enable individual contributors to make decisions for systems they\u2019re\nresponsible for
allow domain experts to have input in decisions when they\u2019re not\ndirectly involved in building a particular system
manage the risk of decisions made
include team members without it becoming design by committee
have a snapshot of context for the future
be asynchronous
work on multiple projects in parallel

In my company, RFCs allow us to propose ideas to improve processes,\ndevelopment experience, the product, with low risk of retribution and\nwithout the anxiety of giving a presentation. Q & A is relaxed for\nboth the author of the RFC and those writing comments, as they are\nallowed to proceed asynchronously, without either party feeling\npressured to have all the answers right away. They\u2019re also a written\nrecord of the thoughts of the authors and reviewers throughout the\nlifecycle of the proposal, and serve as a historical artifact for\nreflection (if a proposal that sounded good didn\u2019t turn out so well, why\ndidn\u2019t it work out?)

Implementing RFCs

Oxide Computer explained how they do RFCs (which they call RFDs,\nRequests for Discussions) here: RFDs at\nOxide Computer

At Oxide, RFDs are appropriate for the following cases:

Add or change a company process
An architectural or design decision for hardware or software
Change to an API or command-line tool used by customers
Change to an internal API or tool
Change to an internal process
A design for testing

Oxide has a few twists, like adding a state as metadata (an RFD can\nbe in the Prediscussion, Ideation, Discussion, Published, Committed, or\nAbandoned state), and go into detail about integrating their RFD system\ninto git.

A Template for RFCs

Following what Oxide did, I made a template repository for RFCs.

You can find it here: Template RFC\nRepository

Learning Recursion

2021-06-03T11:30:55-04:00

\n\n

It\u2019s been said that the only way to learn recursion is to learn\nrecursion. So let\u2019s get started!

Recursion is defined by the repeated application of a procedure.\nThere are three distinct parts to creating a recursive function:

A terminating base case
Continuing the recursion
Making progress towards the base case

Let\u2019s look at all of them while applying them to a problem:

\n
Given an array of integers, return the sum of their values.
\n

A terminating base case

We need a terminating base case, because otherwise a recursive\nfunction will continue forever.

We\u2019ll start backwards (trying to find the case that terminates the\nalgorithm) and work our way from there.

Let\u2019s say we have an empty array: well, that would look like this: If\nthe array is empty, then it makes sense that its sum is 0.

Let\u2019s start writing some code to express that:

int sum_empty_arr(int arr*, size_t len) {\n  return 0;\n}\n\nint sum(int arr*, size_t len) {\n  if (len == 0) {\n    return sum_empty_arr(arr, len);\n  } else {\n    // In the next section!\n  }\n}

And hey, that\u2019s the only base case.

Continuing the Recursion

To continue the recursion, let\u2019s continue thinking: if we have a one\nitem array, what do we do?

A one item array\u2019s sum can be expressed like this:

arr[0] + 0.

Let\u2019s take {1, 2} as our array. Well, the sum of the\narray {1, 2} can be expressed like this:

arr[0] + sum({2})

Similarly, if we have a 3 item array like {1, 2, 3}, the\nsum of the array can be expressed like this:

arr[0] + sum({2, 3})

This formula works on arrays with any length.

Our key insight here is to take the first item of the array, and sum\nit with the result of the sum of the rest of the items in the array.\nWe\u2019ve found a way to continue the recursion.

Next, we\u2019ll have to think about how to make progress towards the base\ncase.

Making progress towards the base case

In our formula, we\u2019ve found a way to continue the recursion. But are\nwe making progress towards the base case?

Our base case will terminate when it is provided an array with a\nlength of 0.

In every recursive call, we reduce the length of the array we provide\nto sum by 1. Thus, as long as our array length is positive\n(which we can safely assume), we\u2019ll make progress towards the base case.\nNice! We won\u2019t recurse forever.

Implementation

We can turn our idea into code like so (I\u2019m doing a bit of pointer\narithmetic to move the pointer past the first item and decrementing the\nlength before calling sum again).

int sum(int *arr, size_t len) {\n    if (len == 0) {\n        return sum_empty_arr(arr, len);\n    }  else {\n        int head = arr[0];\n        len--; // decrement the length\n        arr++; // move past the first item\n        return head + sum(arr, len);\n    }\n}

And hey, if we run it:

int main(void) {\n int arr[] = {1,2,3,4};\n int total = sum(arr, 4);\n\n printf("%d\\n", total); // 10\n}

And we get the correct result.

We can clean this code up a bit:

int sum(int *arr, size_t len) {\n    if (len == 0) {\n        return 0;\n    }  else {\n        return arr[0] + sum(++arr, --len);\n    }\n}

Interestingly enough, even though python has a slicing operator, the\nimplementation is similar in length:

def list_sum(arr):\n if len(arr) == 0:\n return 0\n else:\n return arr[0] + list_sum(arr[1:])

Similarly, in a language like OCaml, where recursion is the idiomatic\nway to express algorithms:

let rec sum = function\n  | [] -> 0 (* if the array is empty, return 0 *)\n  | h::t -> h + (sum t) (* otherwise, return the value of the head + sum of the rest of the elements. *)

Let\u2019s try another problem:

\n
Given a binary tree, calculate the sum of the values of all nodes in\nthe binary tree.
\n

Let\u2019s go through the steps again.

Finding a base case

Let\u2019s ask ourselves what the possible cases are:

If there is no node, because the node is null, clearly it shouldn\u2019t\ncount. Much like in the empty array case, let\u2019s return 0.

If there is a node, let\u2019s return its value.

Great, we\u2019ve got all our base cases covered. Let\u2019s express them\nbefore we continue on:

int sum(TreeNode *node) {\n if (node == NULL)\n return 0;\n else\n return node->val;\n}

But how do we continue the recursion?

Continuing the Recursion

To continue the recursion, we can apply the function we\u2019ve created to\nits left and right node. But how? Well, thinking back to the previous\nproblem, the sum of an array is the sum of the current value (the head)\n+ the rest of the items in the array. Likewise, for a binary tree, we\nneed to find the sum of the left items and the sum of the right\nitems.

Since we know that a null node can\u2019t point to anything, we can leave\nthat case be, and express the sum of the binary tree as its current\nvalue + the sum of its left child + the sum of its right child.

Let\u2019s do that:

int sum(TreeNode *node) {\n  if (node == NULL)\n    return 0;\n  else\n    return node->val + sum(node->left) + sum(node->right);\n}

Making Progress

Are we making progress? We must be: for every node, we move onto its\nchild nodes. Child nodes (hopefully) eventually return null, in the case\nof a finite binary tree (of course, we can\u2019t calculate the sum of an\ninfinitely large binary tree).

We did it! We can take this same idea and apply it to linked lists\nand graphs as well. That\u2019ll be an exercise for the reader, but the idea\nis very similar.

Appendix

Full code to sum of the nodes of a binary tree:

In C:

#include <stdio.h>\n\ntypedef struct TreeNode {\n  int val;\n  struct TreeNode *left;\n  struct TreeNode *right;\n} TreeNode;\n\nint sum(TreeNode *node) {\n  if (node == NULL)\n    return 0;\n  else\n    return node->val + sum(node->left) + solve(node->right);\n}

In Java:

class Solution {\n  record TreeNode(int val, TreeNode left, TreeNode right) {}\n\n  public int sum(TreeNode node) {\n    if (node == null) {\n      return 0;\n    } else {\n      return node.val + sum(node.left) + sum(node.right);\n    }\n  }\n}

In OCaml:

type 'a tree =\n  | Node of 'a tree * 'a * 'a tree\n  | Leaf;;\n\nlet rec fold_tree f a t =\n    match t with\n      | Leaf -> a\n      | Node (l, x, r) -> f x (fold_tree f a l) (fold_tree f a r);;

Unix Environment Variables

2021-05-24T18:58:40-04:00

Let\u2019s talk about some popular unix environment variables:

$USER - The current user
$PAGER - the program that accepts page by page input.\nless and more are good examples.
$VISUAL - A full screen editor (like vi,\nemacs, and nano).
$EDITOR - A line by line editor (ed or\nex work).
$PWD - the current working directory
$HOME - the home directory
$LANG - the language you use, with an optional\nencoding.
$MANPATH - the list of directories to search for manual\npages.
$MAIL - where mail goes
$SHELL - path to shell binary you use\n(e.g.\u00a0/bin/bash, /bin/ksh,\n/bin/sh, /bin/zsh)

The most important one is probably $PATH, which is where\nthe OS looks for binaries. It goes from the beginning to the end,\nexecuting the first binary it finds.

Let\u2019s say my $PATH is like this:

/usr/local/bin:/usr/bin

Which instructs the OS to look into /usr/local/bin to\nfind a valid binary. Then /usr/bin.

The /bin directory contains binaries for sysadmins and\nusers, but are required when there\u2019s no filesystem in use.

The /usr/bin and was meant to contain executable\nprograms that were part of the OS

and /usr/local/bin is for software that the user\ninstalls.

There directories where superuser binaries should be located which\nfollow the same scheme:

/sbin
/usr/sbin
/usr/local/sbin

As well, /usr/share/bin is for binaries used for web\nservers and clients.

If you find that a command doesn\u2019t work, double-check to make sure\nthat your $PATH is set up properly to find the correct\nbinary.

The Central Limit Theorem

2021-05-19T19:55:46-04:00

The central limit theory (CLT) states that when\nindependent random variables are added, their properly\nnormalized sum tends toward a normal distribution (a bell curve) even if\nthe original variables themselves are not normally distributed. ~\nWikipedia

One constraint is that we must have finite variance, as that lowers\nthe amount of outliers we see.

Let\u2019s roll a dice 1000 times and see what that gets us:

from collections import Counter\nimport random\n\ndef r():\n    return random.randrange(1, 7)\n\n\nrolls = Counter()\n\nfor _ in range(1000):\n    rolls[r()] += 1\n\nsorted_dict = {k: rolls[k] for k in sorted(rolls)}

Running it I get this:

{1: 156, 2: 168, 3: 192, 4: 143, 5: 170, 6: 171}

1 was rolled 156 times
2 was rolled 168 times
3 was rolled 192 times
4 was rolled 143 times
5 was rolled 170 times
6 was rolled 171 times

You\u2019ll see that there\u2019s some variance, but we get a good enough\nresult.

To test the central limit theorem, let\u2019s try to roll two dice at the\nsame time 1000 times and plot it.

Change this line to this:

for _ in range(1000):\n rolls[r() + r()] += 1

Here were the rolls:

{2: 22, 3: 59, 4: 81, 5: 110, 6: 149, 7: 174, 8: 144, 9: 101, 10: 77, 11: 53, 12: 30}

And here\u2019s the distribution:

You\u2019ll notice it\u2019s starting to converge on 6 and 7, but 2 and 12 were\nfairly unlikely.

We\u2019re starting to get a normal distribution!

Let\u2019s do 5 dice rolls.

Change the line below:

for _ in range(1000):\n rolls[r() + r() + r() + r() + r()] += 1

Here\u2019s the outcome:

{7: 2, 8: 8, 9: 8, 10: 14, 11: 33, 12: 35, 13: 54, 14: 72, 15: 61, 16: 85, 17: 94, 18: 93, 19: 108, 20: 88, 21: 82, 22: 65, 23: 37, 24: 22, 25: 17, 26: 11, 27: 8, 28: 2, 29: 1}

And the five dice rolls.

We get closer to a normal distribution.

Let\u2019s do it a million times:

for _ in range(1000000):\n rolls[r() + r() + r() + r() + r()] += 1

Here\u2019s the outcome:

{5: 147, 6: 600, 7: 1890, 8: 4469, 9: 9147, 10: 16052, 11: 26310, 12: 39423, 13: 54070, 14: 69536, 15: 83228, 16: 94310, 17: 99997, 18: 100488, 19: 94484, 20: 83745, 21: 69777, 22: 53880, 23: 39506, 24: 26489, 25: 16144, 26: 9076, 27: 4547, 28: 1900, 29: 649, 30: 136}

And hey look, that looks like a normal distribution to me!

File Io

2021-05-19T16:41:02-04:00

\n\n

File I/O is slow: but how slow is it really? Are there any ways we\ncan make it faster? Let\u2019s find out!

First let\u2019s start out by writing the character a to a\nfile in python:

import timeit\n\n\ndef test():\n    with open(f'output.txt', 'w+') as f:\n        f.write('a')\n\n\nif __name__ == "__main__":\n    print(\n        f'This took {timeit.timeit("test()", globals=locals(), number=1)} seconds.')

On my machine, this prints out:

This took 0.0002999180000000032 seconds.

Makes sense. Since it\u2019s hard to look at such small numbers, let\u2019s\nbump our number of repetitions up to 10000.

import timeit\n\n\ndef test():\n    with open(f'output.txt', 'w+') as f:\n        f.write('a')\n\n\nif __name__ == "__main__":\n    print(\n        f'This took {timeit.timeit("test()", globals=locals(), number=10000)} seconds.')

On my machine, this prints out:

This took 4.582478715000001 seconds.

Let\u2019s try something similar but in memory. Let\u2019s add the string\na to an empty string and return it:

import timeit\n\n\ndef test():\n    s = ''\n    s += 'a'\n    return s\n\n\nif __name__ == "__main__":\n    print(\n        f'This took {timeit.timeit("test()", globals=locals(), number=10000)} seconds.')

On my machine, this prints out:

This took 0.0009243260000000031 seconds.

Doing some math, writing to a file 10000 times is 5000x slower than\nwriting to a string 10000 times in memory.

So our intuition (and our Operating Systems textbooks) are correct.\nLet\u2019s dig deeper to see if we can find anything else.

Intuition

Since all we\u2019re doing is opening a file, writing to it, closing the\nfile 10000 times, maybe there\u2019s some way to speed up this operation.

Let\u2019s build a mental model for how python writes to a file:

Open output.txt.
Write the character a to output.txt.
Close the file.

Suggestion 1:

Since we\u2019re opening and closing the same file, what if we had some\nabstraction that represented the file? Let\u2019s say we had some integer\nthat would represent the file (a file descriptor) and we kept track of\nits state inside of our program. Whenever we need to save our changes to\ndisk, we notify the OS.

So instead of doing:

repeat 10000 times:\n  open `output.txt`\n  clear the contents of `output.txt`\n  write `a` to output.txt\n  close `output.txt`

Which would require us to open the same file 10000 times:

We try this:

file_contents = {}\nfile_contents['output.txt'] = 'a'\nopen file\nclear the contents of `output.txt`\nwrite file_contents['output.txt'] to `output.txt`\nclose `output.txt`

Which would only require 1 call to the OS to open the file, 1 call to\nthe OS to write to the file, and 1 call to the OS to close the file.

Python does this to some degree out of the box: the interpreter keeps\na dictionary of file_descriptor -> changes and when it deems\nnecessary, it gives the file changes to the OS.

To make python commit its buffer to the OS, use the\nflush() function.

Suggestion 2:

What if the OS had a cache too? Since there are many processes trying\nto access the OS\u2019 resources, the OS has a chance to reconcile file\nwrites and batch them in a way that is more efficient.

Let\u2019s say we ran the same python program twice at exactly the same\ntime. If we only employed caching at the python level, we\u2019d have to\nwrite to the same file twice with the character a. Of\ncourse, the OS can reconcile those changes and make it so there\u2019s only 1\nopen-write-close cycle required.

It turns out both of these suggestions are implemented.

To force the OS to propagate a change, you can use the\nos.fsync(f.fileno()) function. When called, python asks the\nOS persist the changes in file descriptor f to disk.

Const Correctness

2021-05-17T19:32:39-04:00

Const correctness means marking all items you can const\nto prevent unwanted mutation. Let\u2019s say you want to grab a few options\nfrom a settings map that you\u2019ve created.

Let\u2019s say you want the time to live value and the created at from a\nmap.

Does this compile?

void setTimeToLive(int ttl) {/* implementation here */}\nvoid setCreatedAt(int createdAt) {/* implementation here */}\n\nvoid getOptions(const std::map<const char*, int> &m) noexcept {\n  const auto ttl = m["ttl"];\n  const auto createdAt = m["createdAt"];\n  setTimeToLive(ttl);\n  setCreatedAt(createdAt);\n}

Nope: since map[] will insert in the case that it\ndoesn\u2019t find a matching key, this doesn\u2019t compile. We can\u2019t insert into\na map marked const.

Let\u2019s say we didn\u2019t mark the map as const:

void setTimeToLive(int ttl) {/* implementation here */}\nvoid setCreatedAt(int createdAt) {/* implementation here */}\n\nvoid getOptions(std::map<const char*, int> &m) noexcept {\n  const auto ttl = m["tttl"]; // oops, typo\n  const auto createdAt = m["createdAt"];\n  setTimeToLive(ttl);\n  setCreatedAt(createdAt);\n}

This compiles now, but oh no, what\u2019s the value of ttl?\n0. When we access a map\u2019s key that doesn\u2019t exist, we get\nsome default value. In our case, a value of 0.

So we\u2019re at a crossroads. We want our code to compile, be correct,\nand still allow the map to be const.

Let\u2019s let that happen:

void setTimeToLive(int ttl) {/* implementation here */}\nvoid setCreatedAt(int createdAt) {/* implementation here */}\n\nvoid getOptions(const std::map<const char*, int> &m) {\n  const auto ttl = m.at("tttl"); // oops, typo\n  const auto createdAt = m.at("createdAt");\n  setTimeToLive(ttl);\n  setCreatedAt(createdAt);\n}

We replace map::[] with map::at.\nmap::at does a checked get in the map. If it doesn\u2019t find\nthe key, it throws an exception.

We remove our noexcept from the function because this\nfunction can throw and we move on with our lives.

Const correctness saves lives.

Virtual Functions

2021-05-16T21:28:00-04:00

While learning Java, I came across this line of code:

List<Integer> array = new ArrayList();

As someone who knows C++, this is somewhat confusing \u2013\nList is an interface that ArrayList\nimplements. But if we treat ArrayList as a list, then when\nwe call List.add() we must use dynamic dispatch to find the\nright implementation, since the implementation of add isn\u2019t\non the list interface.

That involves a virtual function lookup.

In Java, most function calls are virtual, unless a\nmethod is declared as final (which means it cannot be\noverriden).

By doing this, we get an advantage \u2013 if we decide to change our\narray variable from an ArrayList to another\nList interface conforming type, we can do so without\nbreaking any code.

In exchange, we cannot use any ArrayList specific\nmethods without casting to ArrayList, which nullifies this\nbenefit.

We have information that both ArrayList and LinkedList implement the\nList interface, so we can treat them as such in a collection.

ArrayList<List> lists = new ArrayList<List>();\nlists.add(new ArrayList());\nlists.add(new LinkedList());\n\nfor (List<Integer> l : lists) {\n    l.add(10); // defer to each lists' implementation for add\n}

Since we know that Java uses virtual functions for interface\nimplementations, everything works as expected. l.add(10)\ndefers to the implementation of ArrayList and\nLinkedList, and each list is added a 10.

A full example of virtual functions might look something like\nthis:

public class Animal {\n  void move() {\n    System.out.printf("walk %s\\n", this.name());\n  }\n\n  String name() {\n    return "Animal";\n  }\n\n  public static void main(String[] args) {\n    Animal animals[] = {new Animal(), new Bird(), new Elephant()};\n\n    for (Animal animal : animals) {\n      animal.move();\n    }\n  }\n}\n\nclass Bird extends Animal {\n  @Override\n  void move() {\n    System.out.printf("fly %s\\n", this.name());\n  }\n\n  @Override\n  String name() {\n    return "Bird";\n  }\n}\n\nclass Elephant extends Animal {\n  @Override\n  String name() {\n    return "Elephant";\n  }\n}

Running this prints this out:

walk Animal\nfly Bird\nwalk Elephant

Here we create a base class, Animal, which has two\nmethods, name and move. This class is\ninstantiable because it has default implementations for both methods.\nThe Bird class overrides the move method, since it flies instead of\nwalks, and the Elephant only overrides the name. When we collect them\ninto an array and call the move() method on each animal as\nan animal, Java does the virtual function lookup and calls the correct\noverriden method as we expect.

It turns out that not every language does this, mainly because there\nis a runtime cost in dynamic dispatch.

In C++, you must designate a function as virtual which\nlabels a function as overridable by an inherited class.

A roughly word for word translation of the above java program would\nlook like this:

#include <stdio.h>\n\nstruct Animal {\n  virtual const char *name() const { return "Animal"; }\n  virtual void move() const { printf("walk %s\\n", name()); }\n  virtual ~Animal() {}\n};\n\nstruct Bird : public Animal {\n  virtual void move() const override { printf("fly %s\\n", name()); }\n  virtual const char *name() const override { return "Bird"; }\n};\n\nstruct Elephant : public Animal {\n  virtual const char *name() const override { return "Elephant"; }\n};\n\nint main(void) {\n  const Animal *animals[] = {new Animal(), new Bird(), new Elephant()};\n\n  for (const auto &animal : animals) {\n    animal->move();\n  }\n}

This returns:

walk Animal\nfly Bird\nwalk Elephant

In the Java version it isn\u2019t apparent to the programmer that there is\na runtime cost, but C++ puts it front and center. Since we used the\nnew keyword, everything is placed on the heap. When we try\nto call the move() method, we have to do it with the\n-> operator, which calls an object on the heap, rather\nthan on the stack.

Let\u2019s say we don\u2019t use the new operator, and don\u2019t use a\npointer lookup:

int main(void) {\n  const Animal animals[] = {Animal(), Bird(), Elephant()};\n\n  for (const auto &animal : animals) {\n    animal.move();\n  }\n}

This prints out:

walk Animal\nwalk Animal\nwalk Animal

Since we are literally treating each animal as an Animal\ntype, animal.move() calls the default Animal\nimplementation of move. If we choose to have no runtime cost, we get\nless desirable behavior. But C++ gives you the choice upfront.

Let\u2019s dig deeper.

C doesn\u2019t have any language support for virtual\nfunctions, but we can still emulate them.

A (very) rough translation of the java program might look like\nthis:

#include <stdio.h>\n\ntypedef struct Animal {\n  const char *name;\n  void (*move)(const struct Animal *);\n} Animal_t;\n\nvoid fly(const Animal_t *a) { printf("fly %s\\n", a->name); }\nvoid walk(const Animal_t *a) { printf("walk %s\\n", a->name); }\nvoid animal_move(const Animal_t *a) { a->move ? a->move(a) : walk(a); }\n\nint main(void) {\n  const Animal_t animals[] = {\n      {.name = "Animal"}, {.name = "Bird", .move = fly}, {.name = "Elephant"}};\n  const size_t animals_len = sizeof(animals) / sizeof(Animal_t);\n\n  for (int i = 0; i < animals_len; i++) {\n    const Animal_t animal = animals[i];\n    animal_move(&animal);\n  }\n}

C actually lets us do some interesting things here: It allows us to\nallocate our animals on the stack. As well, we create an\nanimal_move function that asks the struct passed in if it\nhas a function pointer for move. If it does, then it defers\nto that, otherwise it calls a default implementation. If we do choose to\nuse a more specific version of move, then we do have the\npointer lookup cost, but if not, there is no cost.

Predictably, this prints:

walk Animal\nfly Bird\nwalk Elephant

Digging a little up, we find that Rust has a similar concept, but it\ndoes away with using keywords like virtual or\nfinal to designate dynamic dispatch.

pub trait Animal {\n    fn name(&self) -> String {\n        "Animal".to_string()\n    }\n    fn act(&self) {\n        println!("walk {}", self.name());\n    }\n}\n\nstruct GenericAnimal {}\nimpl Animal for GenericAnimal {}\n\nstruct Bird {}\nimpl Animal for Bird {\n    fn name(&self) -> String {\n        "Bird".to_string()\n    }\n    fn act(&self) {\n        println!("fly {}", self.name());\n    }\n}\n\nstruct Elephant {}\nimpl Animal for Elephant {\n    fn name(&self) -> String {\n        "Elephant".to_string()\n    }\n}\n\n\nfn main() {\n    let animals: Vec<&dyn Animal> = vec![&GenericAnimal{}, &Bird{}, &Elephant{}];\n    for animal in animals {\n        animal.act();\n    }\n}

We create a trait of animal (kind of like an interface,\nbut supercharged) and then we create an instantiable version of it\n(GenericAnimal) that just takes the default implementation.\nThen we implement our Bird and Elephant and we\ncollect them into a vector and properly call the method on them. The\ncompiler assumes that we want dynamic dispatch by default and does it\nfor us. If not, we can call the parent classes\u2019 method:

Animal::act(&Bird); // this calls the Animal version of `act` with a bird.

This is similar to C++:

const Bird* bird = new Bird();\nbird->Animal::move(); // calls Animal::move with bird.

In short, here\u2019s a history of virtual functions and their syntax,\nstarting from C and ending with Rust:

In C, there\u2019s a rough way to emulate virtuals, but it takes some\neffort since it\u2019s not built into the language.

In C++, virtual functions were deemed useful enough to be built into\nthe language. This led to a terser syntax for overriding, but led to\nmore cognitive load on the programmer, since they had to choose\nvirtual or non-virtual implementations.

In Java, most methods are by default virtual, so the implementation\ndetails are hidden from the programmer. Java allows you to declare\nnon-overridable methods as final, so final\nmethods have no runtime cost, and all @Override methods\nhave runtime cost, which is a fair tradeoff.

In Rust, the compiler figures out if you want a virtual\nor normal method call through your trait implementations, but allows you\nto call the base method through a subclass if you so choose. This allows\nfor having even less cognitive load than in Java (no\n@Override or final necessary), but with an\nescape hatch to call the base class method in an overriden type (as C++\nallows).

The Strong Force and Weak Force of Companies

2020-08-18T13:19:06-05:00

There are four fundamental interactions in physics; gravity,\nelectromagnetism, the strong force, and the weak force. Particles are\naffected by all four of these \u2013 hence why they are fundamental to\nunderstanding physics. The strong force holds the nucleus of an atom\ntogether. The weak force tries to tear the nucleus of an atom apart,\ncreating radioactivity.

The strong force is a million times stronger than the weak force at\nsmall distances, up to 10 angstroms. However, as you add more protons\nand neutrons to the nucleus of an atom, the atom gets bigger - and the\nweak force becomes stronger and stronger, until eventually the nucleus\nof an atom is too big to be kept together by the strong force, and the\nweak force tears the atom apart, resulting in a radioactive atom.\nCompanies follow the same trajectory.

Companies are a lot like atoms; at first, growth is key \u2013 you want to\nextend your revenue by having people buy the service you offer them. How\ndo you build a better service? By hiring. So you hire people to improve\nyour service. At first, this works excellently. Your first few hires\nhave great impact at the company, and they lay the foundation for hires\nin the future. Of course, it\u2019s not all roses \u2013 they make some mistakes,\nand put in guardrails to make sure future employees don\u2019t make the same\nmistakes. But at this point of the curve, it is worth way more to you as\na company to continue hiring, since each hire gives you more value than\nyou pay them. But as time passes and you hire more people, each hire has\nless impact. They\u2019re encumbered by the communication cost of talking to\nmultitudes of stakeholders, and the cost of checking all of the\nguardrails their predecessors laid down for them. Eventually work slides\nto a snail\u2019s pace, and it shows \u2013 the product doesn\u2019t improve very much,\nif at all \u2013 features that might take a few days take months at a time\nnow, and your revenue doesn\u2019t increase as much as it used to. You\u2019ve hit\nyour first slope. Your competitive advantage, your strong force is\nfading. Everyone else is coming for you. The weak force is\nstrengthening.

You have a few ways of delaying the inevitable \u2013 maybe you find a\nnovel way to manage resources so you become better than the competition.\nMaybe you realize that some of the rules you\u2019ve enshrined don\u2019t do much\nfor you, so you cut excess baggage. Maybe you can\u2019t do the above, and\nyou cut excess resources in the form of employees. You\u2019ve managed to\nkeep the competition at bay, but your strong force weakens as you do\nthis. If you manage resources to become better than the competition,\nwhat stops the competition from copying you? If you cut excess baggage,\nyou run the risk of cutting out rules that were well-intentioned and\nprotected you from damage. You run the risk of plunging your company\ninto a dark age, where the employees have to rediscover the practices\nthat were purged from the annals of history to resume work. If you cut\nemployees, you risk the loss of morale that keeps your company churning.\nIf the company is willing to axe some people, what stops it from axing\nthe rest of them? Each time you do this, you come closer to your doom by\nthe hands of the weak force.

A company that keeps delaying the inevitable soon falls to its own\nforces. It blows up, metaphorically. People stop wanting to work at the\nfirm because there\u2019s too much bureaucracy. The product stop dominating\nthe market, and revenues fall. The consumers drop you for your\ncompetition. Your company blows up, becoming a waste zone.

Business is all about keeping the weak force away. As we\u2019ve made\nprogress in business, we\u2019ve learned how to do this better. Companies\nfind ways to make more revenue with fewer people, and do more with less.\nBut this too shall pass \u2013 the companies of yesteryear will soon be left\nin the dust. And so on and so forth. I wonder what the future of\norganization has in store for the rest of us?

The Business Curve

2020-08-08T10:25:54-05:00

In Zero to One, Peter Thiel proclaims \u201cCompetition is\nfor losers.\u201d \u2013 if you can make a monopoly, you should, because you can\nextract far more profit from it than if you participated in an open\nmarket.

In a free market, with perfect competition, you face this supply and\ndemand curve below. You have to sell the good at the price where supply\nmeets demand, and you can sell up to but not more than quantity where\ndemand meets supply. Assuming you\u2019re not the only seller of the good,\nyou have to settle for some chunk of the market.

Assume there are 5 merchants, and the amount of revenue you can bring\nin is the square created by the area of where supply meets demand, you\nget something like this:

All of you split the market somewhat equally, and you make 0\neconomic profit. Nobody can sell the good for more than it\u2019s\nworth (opportunity cost + cost of production), and nobody will buy the\ngood for more than it\u2019s worth (opportunity cost + cost of production).\nThe perfectly free market makes buyers and sellers equally well off.

This assumes that the first few products that a firm sells on the\nopen market are the cheapest for it to produce, and that there\nare buyers that are willing to pay more for the product than\nothers, because it is inherently worth more to them. The firms get to\nsell the goods that cost less for them to make at a higher price, and\nthe buyers who would\u2019ve bought the good at a higher price to get to pay\nless for the good. Everyone wins a little bit.

But firms want to make economic profit. This is where the\nreal money comes in. You can do this in many ways; organizations like\nOPEC (the Organization of the Petroleum Exporting Countries) band\ntogether every year and restrict the supply of oil on the open market.\nThis lets them charge more for their good (oil), since artificial\nscarcity drives up the price of oil.

Because of this, the market for oil looks something like this:

The utility that both sides get is:

Nice! We\u2019ve made economic profit. What we\u2019ve done here is\ntaken a large slice of the pie for ourselves; the buyers lose some\nutility, but in exchange the sellers get more utility. However, we\u2019ve\nlost some total utility. The area to the right of the seller\u2019s\nutility gain is now white. That\u2019s utility that we\u2019ve lost. This\nis called Deadweight Loss, because OPEC has now made some\nutility inherent to the market unreachable. The market as a whole could\nbe better off and capture that white area if OPEC decided to sell oil at\nthe market competitive price.

OPEC made the market better for the sellers, in exchange for hurting\nthe buyers, and worse for the economy as a whole.

In this case, OPEC is an oligopoly, or a case where there are only a\nfew sellers of a good. A monopoly is when there\u2019s only one, and a\nduopoly is when there are only two. All of these traditionally generate\ndeadweight loss, generating less utility for the buyer and more for the\nseller(s). This is why you hear how monopolies and oligopolies are\nbad and should be broken up by the government.

But OPEC isn\u2019t a maintainable oligopoly. Saudi Arabia doesn\u2019t trust\nthe other members of OPEC. Russia doesn\u2019t trust OPEC. In fact, OPEC\ndoesn\u2019t trust itself. The reason is simple: As a seller, you could make\nmore money if you undercut the market and sell more oil at a lower\nprice. If I sell oil at a lower price than the other countries, I get\nall of the profits. That entire red rhombus I had to share with the\nother sellers? All mine. And I expand it a little bit too. The rhombus\ngets fatter for me, at the expense of all of the other sellers.\nOligopolies have a cooperation problem \u2013 the economy cannot enforce its\nsellers to sell a good at a higher price than it values said good. So\nRussia, Venezuela, Saudi, and so forth go to each OPEC meeting wondering\nif this is the time the other countries backstab them and make off with\nall the gold. In OPEC\u2019s case, all of the countries have to be well armed\ntoo, because they don\u2019t know when the other countries might come to\ncapture their oil to increase their profits. Cooperation works worst\nwhen all parties are paranoid.

The problem with this model of economics is that it only somewhat\napplies to some goods in some places. This assumes we have symmetric\ninformation \u2013 that sellers know just as much as buyers. This also\nassumes that our goods are private and no different than each other. But\nreally, every good is somewhat different; coke isn\u2019t exactly the same as\npepsi, even though they are similar products. This model also doesn\u2019t\ncount network effects, where making a good become more popular has\nbenefits to all of its users and its overall utility.

If I wanted to buy some British soda, I would have to import that\nsoda to the US in order to buy it. That costs me both time and money;\nand so the soda costs more than it would if I simply wanted a coke. If I\nwanted a coke, I could just walk down the street since every grocery\nstore carries it, because of its popularity. Coke is cheaper than this\nBritish soda, Iron Bru.

A service like Uber really wants a monopoly for that reason; because\nif it had a monopoly, it would be able to have all of the drivers on its\nplatform available to serve all of the riders, and thus be able to get\neconomic profit. Alas, Lyft exists; and so, there are only half the\navailable drivers on Uber and half the available riders on Uber. Uber\nand Lyft are then also forced to price their service lower and lower to\nbeat each other out and win the loyalty of their riders. Buyers win\nhere, but the sellers are engaged in a race to the bottom. And only one\nwill survive, a la highlander.

If we assume that a monopoly is the only way to go to gain\nmonopolistic profits, we need to find a way to do so, without using\nforce (because that breeds paranoia). The only way out is by making a\nbetter product. If we make a better product, we can create a market that\ndidn\u2019t exist before, and gain monopolistic profits from that market. If\nthis product cannot be replicated, then there is no incentive to stop\nthe monopoly; because the market will only be worse off. Thus, no use of\nforce in any way can stop us from gaining monopoly profits. But building\nthe next Google is hard. Incredibly so.

You can, however, build a product that has differentiation in some\nway. You must have a feature that is hard and useful,\nand you too can ride a smaller but also profitable monopoly curve for\nprofit.

Language Pessimism and Optimism

2020-04-24T16:31:51-05:00

\n\n

Let\u2019s talk languages. Not languages like Swahili or Latin, but\nlanguages like C or Java. Programming languages. I would argue that the\nprogramming language you use day to day is the most important tool you\nuse in programming \u2013 if you use a modern language, it offers a lot of\npunch compared to something like assembly. The programming languages of\ntoday are cross platform, which gives you compatibility,\nmeaning you don\u2019t have to worry about writing your code for every\nplatform or architecture, because your language takes care of it for\nyou. It gives you abstraction, because you can write code that\ntakes a different shape than the way the machine sees it. Machines can\u2019t\nunderstand for loops or if statements, or even really anything more than\n0s and 1s. We made up things like functions and classes and interfaces\nand stuff like that. And these abstractions are really good. I\u2019ve never\nwondered when a for loop or if statement didn\u2019t work as intended in the\nlanguage. Programming languages also give you safety \u2013 they\nprevent you from doing bad things with your program. If I try to cast to\na non-compatible type, my compiler might catch that \u2013 if I try to write\nto some memory I don\u2019t own, my language or Operating system will stop\nthat. And that\u2019s just scratching the surface \u2013 there\u2019s plenty more to\nlove about languages. Higher order functions! Algebraic Data Types!\nOptimizations! Syntactic Sugar! It\u2019s a joyride all the way down. But\nwe\u2019re not going to talk about the nitty-gritty type theory of languages,\nno, today we\u2019re going to be talking about how people view languages in\nprogramming, by doing everybody\u2019s favorite thing \u2013 dividing people into\ntwo kinds of people, the x\u2019s and the y\u2019s, or the pessimists and\noptimists.

The Pessimists

The pessimists tend to say something along these lines quite often:\n\u201cGood code can be written in any language, and bad code can be written\nin any language. It\u2019s all about discipline when writing code.\u201d COBOL is\nas good as Java. People made great systems in COBOL, just like they made\ngreat systems in Java. And it\u2019s true. Much underpinning financial\nsystems were written in COBOL. Many COBOL systems are still out in the\nwild running, decades after they were written. And to the pessimists\u2019\ncredit, they work just fine. After all, if a language is Turing\ncomplete, it is as powerful as any other language. Basically\nall popular languages are Turing complete, so they\u2019re theoretically\nequivalent in power. Some languages are just better at certain things,\nbut they can all express the same things.

The Optimists

The optimists might say that languages are different. An optimist\nmight say that they prefer C++ to C because it has collections and\nObject Orientation. An optimist might enjoy using smart pointers to aid\nin memory management. An optimist might like the way that python handles\niterable collections. These all aid in code readability, because the\nabstraction is easy to understand and encapsulates more functionality in\nfewer lines. Maybe an optimist likes static types, because they make it\nexplicit what types of data a function or method might return, or make\nit easier to understand the way data transforms throughout its\nlifecycle. In the optimists\u2019 eyes, there are abstractions which aid in\nthe expression of code, and since different languages choose different\nabstractions to make the expression of some types of programs harder and\nsome easier, there are some differences between languages. You can\nprogram in an object oriented style in C. You can program in a generic\nstyle in C. These are all far less compact than their equivalent C++.\nThere is an aesthetic difference, if not theoretical difference.

A Twist

If you follow my points up above, I\u2019m promoting the idea that\nlanguage optimists prefer the aesthetics of a language, whereas the\nlanguage pessimists prefer the theoretical power of a language.\nTheoretical power of a language is not bounded by expressiveness,\nhowever. If it took you 100 lines of code to read from a file in a toy\nlanguage (let\u2019s call this language Airplane), but 1 line of code to read\nfrom a file in another language (let\u2019s call this language Ship), Ship\nand Airplane would be theoretically equal in power. But in practice,\nmost people would prefer to use Ship for reading files. 1 line of code\nis 100x less than 100. If this was the case for everything (let\u2019s say it\ntook 100x more lines to write any possible program in Airplane than in\nShip), you would probably prefer Ship. In terms of expressiveness, Ship\nis better than airplane.

Ship is still not more powerful than Airplane; it is simply\nmore expressive.

But that\u2019s still just aesthetics; what if there was a class of\nproblems that Ship did not suffer from that airplane suffered from.\nLet\u2019s say that Airplane and Ship are both Web programming languages, and\nif you fail to write \u201csafe\u201d Airplane code, you might allow some users to\nread information that belongs to another user\u2019s. Let\u2019s say in Ship, this\nproblem doesn\u2019t exist \u2013 Ship wouldn\u2019t let our code compile if it could\ndetect the possibility of this bug.

Even with this added on, Ship is still not theoretically\nmore safe than Airplane; a correctly written Airplane program is as\nsafe as a Ship program that compiles. But I would say in this case,\nmost programmers would prefer Ship. Even though Airplane is\ntheoretically as safe as Ship, a language that\nprovably lacks a class of errors is at least marginally better than a\nlanguage that may probabilistically have a class of errors.

Now the twist is, is Ship in a different class of language compared\nto Airplane? Does safety matter?

Choose your own adventure

It\u2019s up to you. Theoretically safety doesn\u2019t play a part in\nprogramming. God awful safety vulnerabilities and memory leaks don\u2019t\nmatter because, while they cause us and our users to be sad, and might\neven cause us to go to jail, they don\u2019t make our program any less\npowerful. A program doesn\u2019t care if you go to jail or not, actually, and\nneither does math. If you are concerned with this definition of power,\nthen Airplane and Ship are indeed equivalent. But maybe you have a more\npractical bent. If so, Ship is probably the better language \u2013 hey, you\ncan express yourself with 100x less code and it has more safety built in\nthat Airplane.

Is Safety Preference?

That leads me to my last point. Is Safety Preference? Is a safer\nlanguage a categorically different language than one that an unsafe\nlanguage? I would argue so. Most programmers these days will probably\nnever touch a memory unsafe language (C and C++ being the most popular\nof these). There\u2019s a whole extra class of errors to account for in these\ntwo languages, and when you choose a garbage collected language, most of\nthe time you don\u2019t have to worry about memory. That helps you out with\ndelivering better safe software. And that\u2019s great. It makes you more\nproductive. But maybe you need to have performance instead of safety,\nand this is where most people go for performance. Safety comes at some\ncost, and you can\u2019t always have safety. But if you can have safety, you\nmight as well buy into it, since it lessens your cognitive load.

Performance Matters

2020-03-29T13:51:41-05:00

Asking programmers about performance generally leads to two trains of\nthought. Either it doesn\u2019t matter, and there are other metrics to chase,\nor performance matters. Most of the time, those who think that\nperformance matters can be broken down into two different streams of\nthought. One group thinks that performance matters because it saves\nmoney \u2013 if your server costs are a million dollars a month and you make\nthe whole system 10% more efficient, you\u2019ve reduced server costs by\nabout a hundred thousand dollars. It\u2019s nice to save that money. The\nother group is more intense about it \u2013 performance is the only thing\nthat matters because it allows us to create programs that solve problems\nthat other (less efficient) programs would simply not allow us to.\nPlenty of hot-path code is written in low-level highly optimized\nlanguages.

All three ways are true in sometimes: sometimes you don\u2019t need to\nlook at code under a microscope because the underlying runtime will\noptimize it, sometimes saving money may save the company, and sometimes\nperformance is the only thing that matters. But I\u2019m going to say that\nperformance has and always will matter because of the fact that hardware\nwill always get smaller, and we want to do more with less. It turns out\nthat there is a great analogue for this in chemistry already.

In chemistry, atoms are held together by gravitational forces. But\nthere are two forces at work, the strong and\nweak force. The strong force keeps the nucleus\nof an atom together, whereas the weak force tries to\ndisperse the protons and neutrons of the atom. As you\u2019d expect, the\nstrong force is stronger than the weak force\nfor smaller elements. But this begins to break down as atoms get bigger\n\u2013 the strong force is not asymptotically stronger than the\nweak force. As such, there are radioactive elements like\nUranium-235, where the weak force overpowers the\nstrong force and causes the element to become radioactive.\nRadioactive elements eventually tear themselves apart, and split into\nsmaller elements at the end of their half-life.

Hardware has historically (and I don\u2019t think this will change much)\nbeen the same way. At first, computers like the ENIAC were the size of a\nhouse, (wikipedia says 1,800 sq ft), and consumed 150kW of electricity\nto perform a whopping three square root operations per second. At first,\nthis is wholly insufficient for most of the work needed to be done on a\ncomputer. So research was done to optimize hardware to the point where\nit was able to be made smaller. That led to the micro-computer, a\ncomputer the size of a desk, which could do some interesting tasks like\nword processing or simple text games. Eventually we found that we had\nenough performance that we could stick a micro-computer into a smaller\ncomputer, one that was portable, and that was called a laptop. Then the\nhardware advanced to the point where it could fit into your pocket, and\nthat became the smart phone. Tablets branched out to be in the middle of\nsize (and therefore compute) of a laptop and a smart phone. Our need for\nsmaller and more efficient hardware has made it so we couldn\u2019t throw\nperformance by the wayside; every time performance was \u201cgood enough\u201d,\nthe hardware got smaller until performance mattered again.

If you believe history will continue, then we\u2019ll have even smaller\ndevices with even more strict real time capabilities. Doctors have been\nasking for more efficient tools to treat patients with for a very long\ntime, for good reason. More portable and cheaper tools for doctors allow\nthem to treat more patients at less cost. More powerful tools allow for\ndoctors to find signs that they might\u2019ve missed with less powerful\ntools. A new generation of video gamers want to bring their games on the\ngo \u2013 as such, the mobile video game industry has blossomed apart from\nthe traditional console based games. VR headsets are now being used for\ntherapy, and this is extremely performance sensitive \u2013 if the program on\na VR headset is slow by even milliseconds, the user may become fatigued\nor nauseous, which defeats the purpose of the technology.

History has always been about doing more with less, and I don\u2019t see\nthat changing for the near future. Maybe when Moore\u2019s law stops becoming\ntrue?

Who Gets Stuff Done

2020-03-12T11:52:45-05:00

\n\n

Software developers are generalists. Ask the average software\ndeveloper questions about networking, databases, compilers, operating\nsystems, data structures, distributed systems, and 9 out of 10 will tell\nyou that they know something about them. But this generalist mentality\nbegins to break down once you acknowledge one of the most fundamental\nthings rules of life:

Nobody can be an expert at everything.

And if nobody can be an expert at everything, we eventually must have\nroles for our field. One mechanic alone cannot create a high performing\nand standards compliant car these days. It\u2019s just too hard. Likewise,\nsoftware development is too hard for any one person to keep the whole\nfield in their head. And that\u2019s a good thing. It means that we have\nroles for who does what, and that helps teams move quickly, without\ngetting bogged down in the minutae.

So we have well defined roles for the work that we do on our teams,\nlike Front-End developer, Back-End developer, DevOps, SysAdmin,\nDesigner, Product Manager. All is dandy in the world. Well, all would be\ndandy if there wasn\u2019t a question of what makes an developer a\ndeveloper.

Once a field becomes sufficiently mature, the practioners of the\ncraft, (the real intense ones anyway) start sharing a common idea of the\nideal practioner. First this starts out as vague and easy to achieve\nideals, like \u201ca real developer would be able to answer fizzbuzz in 30\nseconds\u201d, or \u201ca developer should be able to understand one programming\nlanguage well\u201d, which is all hunky dory. And soon, a regulatory body of\npractioners, filled with the people who subscribe to that ideal the best\nis borne, and they create norms and disseminate them to the rest of the\npracticing population. Psychologists in America have the APA, doctors to\npass a board certification, and lawyers have to pass the bar in each\nstate. I\u2019ll call the idea of the ideal practioner the \u201csoul\u201d of the\nfield, and the regulatory body (a la APA, bar, or other) the \u201cbody\u201d. If\nall is in harmony, the \u201cbody\u201d and \u201csoul\u201d are in alignment \u2013 the\npractioners agree with the higher ups on what should be taught, and what\nconstitutes a \u201creal\u201d practioner.

The system I\u2019ve described above works great when \u201cbody\u201d and \u201csoul\u201d\nare in tune \u2013 there is buy-in from both sides of the table. But\nprogramming has never had that. And it\u2019s because the field is too\nvaried, one that should\u2019ve been split up (and probably will be) in the\ncoming decades. You see, development doesn\u2019t fit so neatly like other\nprofessions do in this model, because at large, there aren\u2019t two groups\nof programmers. There are three. Academics, Systems Programmers, and\nApplication Programmers.

In computer science, academics do research on problems that are\nremarkably forward thinking \u2013 as you read this, papers are being\npublished for increasing the speed of low-level hardware, operating\nsystem calls, database reads and writes, distributed systems,\nprogramming langauges, AI, statistics, quantum computing, cryptography,\nwhat have you. These all have wide impact, maybe even decades later \u2013\nBarbara Liskov (of the Liskov substitution principle) was working on\nobject oriented languages in the 60s, some 30 years before they made it\nto the mainstream for practioners in the form of java. Liskov as well as\nLampert made long lasting contributions to distributed systems research,\nwhich has changed the way practioners have built their infrastructure,\nand has allowed companies like google and amazon to become global\ncompanies. RSA encryption, made in the 70s, is widely used today. There\nis a long tail of important and groundbreaking research, but you get the\ngist \u2013 Academics do important research.

Systems Programmers are the ones who, when they see a reason to,\nimplement the academic\u2019s work. Linux implements some of the cutting edge\nof operating systems research. Zookeeper implements the consensus\nalgorithms that academics envisioned from the 70s. OpenSSL implements\nRSA encryption, and many others. System programmers transfer the\nabstract, theoretical world of theorems and proofs into libraries and\npackages for the rest of us to use.

Application programmers are the ones who take the work of System\nProgrammers and create products and applications that are used by the\nworld at large (read: not tech-savvy people). They deal with the nitty\ngritty of presentation and User Experience, and find creative ways to\nuse the tools they have to make products that require very little\nin-depth knowledge of the product to use.

With these three camps, it is impossible to have either a \u201csoul\u201d or\n\u201cbody\u201d for programmers. I\u2019ll list out the ideal \u201cbody\u201d and \u201csoul\u201d for\neach of the three camps.

Academics:

Body:\n
- An academic body that tests developers on the rigor of their proofs\nand theoretical knowledge.
Soul:\n
- A practioner that thinks from first-principles to expand the rigor\nand breadth of the field.
Education:\n
- A higher education (Masters or PhD)

Systems Programmers:

Body:\n
- A coalition of workers who stress low-level (in code) fundamentals,\nand tool building for performance.
Soul:\n
- A practioner who doesn\u2019t necessarily need to produce academic\nresearch, but can pick up academic works and translate them to libraries\nand packages.
Education:\n
- A degree in the field (Undergrad, Masters, or PhD)

Application Programmers:

Body:\n
- A group of programmers who build products for the general\npopulation.
Soul:\n
- A practioner who can pick up a wide variety of tools, and knows how\nto use them to quickly create the required product.
Education:\n
- Varied.

All three groups are in constant conflict, and this leads to the\nchaotic state of software development \u2013 at one end, the Academics aren\u2019t\neven implementing software \u2013 and at the other end, Application\nprogrammers stress a high-level knowledge of a variety of areas.\nAgreement isn\u2019t necessary in this case, but it is something that would\nbenefit largely in one area. Interviews.

The Hiring Bar

After being interviewed for days by Bell Labs, a young up-start\ncomputer scientist named Bjarne Stroustrup found a job at one of the\nmost coveted research labs in the nation.

After being interviewed for weeks by an Unnamed Big Firm, a young\nup-start new graduate named ${GENERIC_NAME} found a job at one of the\nmost coveted firms in the nation.

See any relation? You should, because I made it painfully obvious.\nBig firms subject prospective candidates to a brutal interview loop,\nconstituting of knowledge of low-level operating systems, compiler\ntheory, algorithms and data structures, distributed system design, and\nothers. This all makes sense if you want a software developer who will\nwork on all of these areas, but most firms do not actually need their\ncandidates to know this knowledge on the job \u2013 it is abstracted away\nfrom them by the work of Systems Programmers who provide (mainly) good\nlibraries to base work off of. Maybe Untitled Big Firm has does have\nproblems of scale \u2013 but your average start-up does not. And yet, the\nhiring process continues this way.

The big firms might be justified for want of unicorn talent (after\nall, they\u2019re willing to pay for it), but most firms simply cannot afford\nto pay the compensation that these developers are worth these days. And\nyet, the hiring process continues, and I hear companies complain on\nLinkedIn about how hard it is to hire and retain good developer talent.\nTo those companies, I only have a few choice words: buck the norms, and\nfix your interview process.

I\u2019ve heard countless stories of acquaintances not passing an\ninterview because they were asked questions that were outside of their\nspecialization, which matched the job. One acquaintance with an interest\nin robotics and hardware was asked about implementing a Todo CRUD app.\nHe failed. Another friend was asked about low-level disk write system\ncalls for a React position.

I think this happens because companies have a mistaken perspective of\n\u201cwho gets stuff done\u201d, A.K.A \u201c10x engineers\u201d. They assume that to be a\n\u201cgreat\u201d developer, you must understand everything about your computer,\nand that translates to great code. That is not true. A good developer\nknows the appropriate level of abstraction for the task at hand. Asking\na systems programmer about application programming, or vice-versa, is a\nsurefire way to destroy your hiring pipeline. The best companies know\nthis, so they don\u2019t do that. They hire \u201c1x\u201d engineers, and make it to\nmarket \u201c10x\u201d as fast as the other firms. Those firms win. The product\npeople love the devs because they crank out bug-free code in record\ntime, and the customers love those companies because they make genuinely\ngood products.

Always value getting stuff done.

10 Predictions

2019-12-26T19:56:17-05:00

\n\n

In a few more days, the 2010s will be over. Lots has changed in the\nprogramming world \u2013 Java is no longer king, but JavaScript, being the\nmost popular language (according to stack overflow) for 7 years\nstraight. NoSQL databases like MongoDB, Redis, and Cassandra have become\nexceedingly popular, as well as front end web technologies such as\nAngular, React, and Vue. Kotlin has become the preferred language for\nAndroid development, along with Swift for iOS. SaaS companies are\nubiquitous, and Marc Andreessen\u2019s prediction of \u201csoftware eating the\nworld\u201d rings even more true today.

Let\u2019s hope that the 20s are an even wilder ride for software\ndevelopment, and to that end, here I\u2019ve decided to compile ten\npredictions for the next decade as a fun little exercise. As a reader of\nthis article, I encourage you to do the same (I\u2019m looking forward to\nseeing how our predictions are different or similar!)

Most of these predictions will be wildly incorrect, but I think this\nis a good excuse to think about what could be coming in the future.

1. Self-driving cars will still be two years out

Before anyone asks, I\u2019m talking about level 5 meaning fully\nautonomous, you could take a nap in the backseat of the car with no\ndriver autonomous. There is a lot of interest in this space, and for\ngood reason \u2013 Large companies like Uber, Lyft, Waymo, and Tesla have\nbeen researching self driving cars for the good part of this decade.\nThere are many technical concerns regarding self-driving cars, but I\u2019m\nactually fairly sure they\u2019ll be solved this decade. Legality is a huge\ngray area. Should self-driving cars never have an accident? If that\nblocks general availability, self-driving cars will never make it on the\nroad. But if the government allows self-driving cars as long as they\nhave less accidents than human drivers, I think there\u2019s a shot they make\nit this decade. But I think the real problem is that regulation will be\nlagging behind.

2. Rust will become a top 10 language in popularity

According to 2019\u2019s Stack overflow\u2019s developer survey, the tenth most\npopular language is TypeScript, which has 21.2% of respondents\nprofessing usage. Right above that is C++ at 23.5%, and right below is C\nat 20.6%. Rust is currently at 3.2%, sitting just below Scala. Of the\nten languages, Typescript is the only one younger than a decade, but it\nhas already edged out C in popularity. Rust is the only language that\ncan save us from C and C++ supremacy in high performance computing. It\nhas a couple of famous backers (like Mozilla and Amazon), and it has won\nmost loved language for 4 years running on Stack Overflow. Allowing\naccess to low level computing without unsafe abstractions is a real\ntreat. Every generation of programmers flocks to a new way of doing\ncomputing \u2013 C saved us from assembly, and C++ followed to tack on object\noriented programming and RAII. Java popularized garbage collection, and\nlanguages such as Javascript, Python, and Ruby added higher level\nfunctional abstractions to the mainstream. I think this next decade will\nsee the rise of Swift, Kotlin, Go, and Rust to the top 10 of\nlanguages.

3. WebAssembly will kill JavaScript and Desktop Apps

I don\u2019t mean kill kill, like how C did away with Fortran. I expect to\nsee JavaScript as a top 5 language still by the end of this decade, but\nI think WASM is too much of a game changer to ignore. Applications with\nhigher performance requirements are gated from the web because\nJavaScript is the only front-end programming language \u2013 you simply can\u2019t\nhave a garbage collected interpreted language be high performance.\nWebAssembly changes all of that. Compile Rust, Go, C or C++ for the\nfront end. Games, exiled to desktops after the death of flash, can come\nback to the web. Developers with high CPU requirements (like AI/ML apps)\nwill most likely find their home on the web this decade. I expect\nsomething similar to npm popping up, but for all kinds of packages in\nall kinds of languages, widening the range of the web.

4. JSON will be replaced with a Typed Transfer Protocol

JSON has been around for 20 years \u2013 but I don\u2019t expect it to be\npopular for another 20. While I enjoy working with JSON APIs, I think\nthat choosing a JavaScript based transfer protocol is a double edged\nsword. Sure, it rose in popularity because it\u2019s just like Javascript,\nbut JavaScript is fast and loose, something not all programmers\nappreciate. Tools to facilitate typing and strictness have popped up for\nJSON, but sometimes it\u2019s better to start from the ground up. I expect\nsomething like YAML with strict typing becoming more of the standard by\n2030.

5. Functional Programming will finally become popular

Programming in a functional style has become all the rage recently,\nbut people still haven\u2019t adopted functional languages into their\ntoolkit. Of the three tenets of functional programming, most mainstream\nlanguages have accepted two, functions as first class citizens, and\nstronger typing. Immutability is hard to implement if all of your data\nstructures are mutable, so that\u2019s a non-starter. I just want to be able\nto talk about algebraic data types at a meetup without being an outcast,\ndarn it!

6. Microservice hype will wear off

This one will probably seem crazy to half of the readers, and obvious\nto half of the readers. Microservices are great because they encourage\nlooser coupling. Unfortunately, looser coupling also requires more code.\nAnd the worst thing you could do to your codebase is increase the amount\nof code it has. If monoliths create tech debt gradually by entangling\neverything in their grasp, the sheer amount of code microservices\nintroduce will blanket your entire organization.

7. Facebook will no longer be a top 10 company

Facebook made one good product 15 years ago. The other two products\ndriving most of their profits, Instagram and WhatsApp were acquisitions.\nAmong the youth, Facebook is unhip and ancient. You know, just like\ncommodores are artifacts of a bygone epoch. Zuckerberg is smart, but I\ndon\u2019t think good acquisitions (which saved Facebook this decade) will\nsave them the next decade. We\u2019ll see though.

8. An AI startup will become this decade\u2019s hottest startup

Last decade\u2019s hottest startups, like Uber, or Lyft, or AirBnB created\nthe gig economy. I expect AI to try to coax the gig economy into its\ncoffin.

9. Blockchain will be this decade\u2019s beanie babies

Blockchain has been gaining ubiquity as a secure new way to exchange\nfunds, but I don\u2019t see it taking off just yet - It strikes me as an idea\nthat is too early for its current decade.

\n\n

10. You and I will host our apps on a new cloud provider (read. not\nAWS)

AWS became extremely popular this decade - and I don\u2019t\nexpect the service to die this decade. While it\u2019s great for enterprise\nby thinning the IT department, it\u2019s not made for you and me. It\u2019s\nconfusing, for one, with configuration hiding behind every corner, ready\nto jump out and spook you. Oh yeah, and it costs a lot.

\n\n

Why are there so many Software Engineers?

2019-04-14T20:56:58.198Z

\n\n

Hardware vs.\u00a0Software

Have you ever wondered why so many people are software developers\ninstead of hardware developers these days? I certainly have. And even if\nyou haven\u2019t, I\u2019ll go ahead and give you a cut and dry example of why\nthat is, and why software is considered to be the way of the future (for\nnow). I\u2019ve never studied anything about either hardware or software, to\nbe fair \u2013 hardware is black magic and software is a black box, so take\nmy narrative and reasoning with a grain of salt.

What is Software

First off, we\u2019ll have to define what software is. I really don\u2019t have\na great definition to give you, but I\u2019m sure you know what software is\nanyway \u2013 it\u2019s anything above hardware, and hardware is anything below\nsoftware. I jest, of course. Software is defined as operating\ninformation used by a computer. These days, any kind of programming that\nbuilds on an operating system (OS) is software programming. To me,\noperating systems are the proverbial line drawn in the sand between\nhardware and software \u2013 an OS takes your instructions and etches them\ninto the hardware. It turns software actions into hardware truth, a sort\nof bridge between hardware and software.

What is Hardware

This time I\u2019m better prepared to read out the definition of hardware!\nPut down your pitchforks. Ahem. Hardware is \u201cthe machines, wiring, and\nother physical components of a computer or other electronic system.\u201d\nSound good? Hardware is all around us. I\u2019m writing this article on a\nMacbook Pro, I ride the bus to and from work, and I even use a microwave\nfrom time to time (a lot of the time, actually). These are all examples\nof hardware with software interfaces that expose an interactive API.\nWhen I click the button to reheat my pizza, the microwave sends\ninstructions to the hardware of the microwave (turn on, use this much\nelectricity, start spinning, turn off, etc). Hardware is everywhere\naround us.

Why Software

Now that we have some definitions out of the way, let\u2019s take a trip\ndown memory lane. Long, long ago there were not that many computers and\nhardware was very expensive. One of the first popular programming\nlanguages was Fortran, which ran on the IBM 704 mainframe computer.\nComputers were the size of rooms back then, and this super computer\npacked a whopping punch. It had a jaw-dropping memory of about 18KB and\nweighed 10 tons. Whoo wee. We\u2019ve come a long way, haven\u2019t we? Anyway,\ngiven the cost of these computers and the amount of electricity they\nconsumed (thousands of dollars worth an hour), it made sense that it was\nimportant to optimize for hardware time, not programmer time. See, a\nteam at IBM had access to one of these. That means a team of smart,\ncapable individuals would all have to share one computer. They\u2019d fuss\nfor hours about making faster algorithms, because it mattered. A lot.\nMost people programmed in pure machine code in those days because a\ncompiler couldn\u2019t come close to the amount of optimization a team of\nsuper smart people would be able to do. Clearly, hardware was the\nconstraining factor here, not labor. And for many years that was true \u2013\ninterpreted languages (such as lisp) were looked over because they ran\nthousands of times slower than native machine code. Needless to say,\nmost everyone programmed in machine code, and that was the way things\nwere.

But eventually, that changed. Think about Moore\u2019s law. The number of\ntransistors in a circuit doubles every two years. With the increase in\ncomputing power of hardware, programmers were freed from having to use\nmachine code for everything. Eventually they used assembly, and then\nhigher level languages like C, and nowadays, extremely high level\nlanguages like Python, Ruby, and Javascript have risen to prominence.\nSure, Javascript runs many orders of times slower than machine language\nor assembly, but with every passing year, higher level languages rise in\npopularity. Every year, our hardware gets better. That means every year,\nhardware costs less. Thus, programmers are freed from having to optimize\nevery facet of their program. Oddly enough, programmers are incredibly\ncheap to hire these days. Give them a $1500 computer, two monitors, a\nkeyboard and mouse, a chair and desk. That\u2019s about $3000 for all the\nhardware they need. But the company needs to pay their salary, which can\nrange quite heavily, but it sure costs a lot more than $3000 a year, let\nme tell you.

So the trend is quite apparent for right now. Hardware costs less and\nless with each passing year, whereas software costs more and more.\nSoftware took over hardware as the restricting factor in development,\nand ever since then, there\u2019s been a boom in software jobs and more of a\nglut in hardware jobs. And it doesn\u2019t seem to be stopping anytime soon.\nSteve Jobs famously told Barack Obama to lower barriers in immigration\nbecause Apple just couldn\u2019t hire enough talented software engineers.\nSalaries have skyrocketed for software developers, and it looks like\nthey won\u2019t be falling back to earth soon, as long as companies have too\nmuch demand for software developers with far too little supply.

Conclusion

So the trend seems to be that we\u2019ll need more and more software\nengineers, and maybe hardware won\u2019t grow as fast. Even with the\nexplosion in mobile devices in the past decade, there can be thousands\nof apps on one phone. That means there\u2019s one team of hardware engineers\nfor every thousand or so teams of software engineers. So, perhaps, the\ndirection from now on will be more software-oriented instead of\nhardware-oriented. Or, perhaps, there\u2019ll be another change. Maybe\nhardware will see a revival when we all become cyborgs in the near\nfuture. Who really knows?

Progressing in Programming

2019-04-11T20:41:20.201Z

\n\n

I\u2019ve always wondered what the power of journaling was \u2013 I\u2019d never\nbeen the type to write all of my goals in a journal, and plan\nmeticulously about what I was going to do \u2013 I took everything at face\nvalue and jumped on every opportunity as it came. But no longer! It\u2019s\ntime to begin blogging about progress. We\u2019re going to make some gains.\n(Not of the gym kind, of course).

Before we begin to do any growing, we have to know what to grow in!\nSo we take this big goal and shrink it down into bite-sized pieces. So\nmy goal is to become a better software developer by the end of the year.\nMy reasoning is that understanding more about the basics of programming\nlanguages will help me out on the day to day, helping me understand what\nI need to do in order to become a better software developer.

So then, what is there to learn? Well, too much. But I\u2019d like to\nimprove at certain parts of programming that are ubiquitous in the\nfield.

C Programming

Ah, C. The language of Unix, the programming language that runs Unix,\nthe most influential Operating System of all time. Such a powerful\nlanguage, with just 32 reserved words, you too can build your own kernel\nin just a few thousand lines using this wildly expressive language. So C\npretty much is the de facto language for compilers and Operating Systems\n(since they\u2019re both built in C), and this makes it a great first\nlanguage to sink my teeth into. The way I\u2019ll do it is by reading through\nProgramming in C (4th edition), which should be coming in the mail soon.\nI\u2019ll be littering this blog with some tidbits I learn from that book,\nand summarizing the key concepts there.

Operating Systems

You\u2019ve noticed I\u2019ve mentioned operating systems in the section above.\nAnd of course, operating systems are invaluable to us. No one these days\ninteracts with a computer without an operating system. So of course,\nlearning about Operating systems should be a high priority \u2013 especially\nUnix based ones. For that, I\u2019ve picked up a copy of OSTEP (Operating\nSystems: Three Easy Pieces) to help aid in learning about virtualization\n(the act of taking the physical hardware and making a virtual API to\ninteract with it, like writing to files, or editing files, spawning\nprocesses), concurrency (the act of turning this single threaded\ncomputer into one that can run multiple processes at once), and\npersistence (finally saving our actions to a hard disk, so they won\u2019t be\noverwritten on next boot up). Operating systems are considered to be a\nhard topic, so I\u2019m looking forward to the challenge! Maybe as a final\nproject, it would be cool to make a small OS that could support file IO\nor do something like that.

Compilers

Next up is Compilers, that thing that compiles your code and turns it\ninto machine language. I don\u2019t know too much about compilers (hey, I\u2019m a\nJavaScript(JS) guy, I use an interpreter), but I\u2019ve always been keen on\ncompilers ever since Babel entered the scene. If you don\u2019t know what\nBabel is, you\u2019ll have to know a bit about front-end web development. One\nof the pain points of front-end development is that we, as front-end\ndevelopers, have to support fairly old browsers, and they support\ndifferent features of HTML, CSS, and JS. For example, IE8 supports\nHTML4, CSS 2.1 (that means no media queries) and ES3 (the version of\nJavaScript that was standardized in 1999\u2026). Currently, Google Chrome\nsupports HTML5, CSS3, and ES10 (the version of JavaScript that was\nstandardized in 2019). So of course, we could all painfully write our\nES5 JavaScript to be backwards compatible with IE8, or we could ditch\nbrowser support for IE8 (a lot of firms have), but there\u2019s still a large\nuser base that uses IE11, so it\u2019s a hard sell to drop support for that.\nIE11 supports HTML5, CSS3, and ES5, which is the 2009 standard of\nJavaScript. You get the idea. Lots of browsers to target, but to target\nthem all, we\u2019d have to use some subset of JavaScript from 10 years ago.\nNot very fun. Many years ago, a solution named Babel was released \u2013 it\ntakes the JavaScript that you write, and turns it into valid es5\nJavaScript code. It\u2019s a compiler for JavaScript, but really, it\u2019s\nperhaps the best open source project out there in the JavaScript\necosystem. And to understand Babel, you have to understand compilers.\nCue the JavaScript music (Babel actually has a theme song, a cover of\nJeff Buckley\u2019s Hallelujah). I\u2019d like to write my own small language too,\npreferably in C, so I can understand the nitty gritty of creating a\ncompiler, and getting closer to the metal, if you will.

C++ Programming

Ah, C++. When you program in C++, every problem looks like your\nthumb, and every goto leads you to a black hole. I\u2019m kidding, by the\nway, I don\u2019t have any experience writing C++, so I can\u2019t tell you much\nabout the language. It\u2019s also very popular, with uses in pretty much all\nindustries, but especially in game development. It\u2019s an excellent way to\nlearn about Object Oriented Programming, something I\u2019m really lacking,\nand it has a wide standard library (again, something that JavaScript is\nsorely lacking), which handles plenty of use cases, such as\nstd::vector<T, T> for lists, and std::array<T, T> for\narrays. What more could you ask for?

Java Programming

Java is the best and worst thing that\u2019s happened to programming in\nthe past 20 years. Any takers on that statement? Steve Yegge said it.\nBut it\u2019s interesting, you see, because when one refers to Java, it\u2019s\nkind of hard to see if someone\u2019s referring to Java or the Java Virtual\nMachine(JVM). Java has so many people who live and die by it (in more of\nthe metaphoric sense), it has so many zealots that swear by it and\nnothing else, and become exalted to memetic heights \u2013 the old timey\npeople used to tell me about how Java saved them from environment\ndependent variables and worrying about memory management and pointers.\nAnd it\u2019s true that Java, specifically the JVM is amazing at abstracting\naway all the stuff that we really shouldn\u2019t be managing anyway, but one\nconcern is that Java is just such a slow moving language \u2013 it was in the\nSun Microsystems days, and it still kind of is in the Oracle Days. But\nstill, Java is a key language to at least know the basics of, and I\nmyself still admire how forward thihnking the Sun Microsystems team were\nin developing both the JVM and Java. Truly wonderful. A+ Language\nindeed.

Go Programming

Golang is the best programming language to come out in the past 10\nyears. How many takers do I have for that statement? Go reminds of C,\nbut brought into the 21st century \u2013 it has garbage collection (yay!), it\ncan be run or compiled (the best of both worlds), the compiler can\nassume what types you\u2019re using, so you can say x := 5\ninstead of the slightly more verbose int x = 5;, there are\nno semicolons (the compiler adds them at the end of each line for you),\nand it has a very small API, with a more modern standard library so that\nyou can do what you need to, without having to know every nook and\ncranny of some arcane language (see: C++, Java, JavaScript). And it has\na cute mascot. That\u2019s the real kicker, to be honest. Java\u2019s mascot, the\nJava Duke, looks like its from a 17th century sketchbook. Python doesn\u2019t\nhave a mascot, just like JavaScript or C. Chalk a win up for the boys\nand girls at Google, their marketing is superb.

Python Programming

Python is one of the most loved languages these days \u2013 and I will say\nwithout a doubt, that it is an amazing language. Guido really outdid\nhimself. But one thing I don\u2019t get is why there\u2019s a fork in the\nlanguage. Why is there Python2 (which this Mac runs) and Python3 (which\nthis blogger uses)? And the syntax is a bit different from the usual C\nderivative language syntax, but I guess that\u2019s a good thing about\npython, it has so many great things about it (such as lambdas from\n1994), and comprehensions (everyone\u2019s favorite feature of functional\nprogramming), and a killer ecosystem (I instinctively etch\nimport numpy as np for almost every .py file I write). And\nit\u2019s a scripting language that can automate everything from emails, to\nfile moving, to web development, to even cars. Why don\u2019t people like\nscripting languages more?

Data Structures and Algorithms

Data Structures and Algorithms are perhaps the most sought after\nconcept in interviews, which has led a lot of people to discount the\nprocess entirely, but you know, I don\u2019t blame tech firms for using data\nstructures and algorithms questions to hire applicants, and here\u2019s why.\nLet\u2019s say you want to hire for a back-end software engineer, and your\nstack is in python\u2019s django, for example. Almost no one will have prior\npython django programming experience, so you have two choices. Either\nyou hire and test all applicants for python django knowledge (which\ncould take your team a year to fill those three vacancies on your team)\nor you could simply test on concepts that are familiar to backend\ndevelopment \u2013 ask them questions like \u201cwhat does MVC mean? How would you\nhandle real time communication needs in an app?\u201d, or even easier, say\nyou don\u2019t even have to write any code, we\u2019ll test you on ideas. And\nthat\u2019s really the core of what the Data Structures and Algorithms\ntechnical interview does, it tests some common knowledge among all\ndevelopers. So you know, it\u2019s not so bad. To get better at Data\nStructures and Algorithms, I\u2019ve got Elements of Programming Interviews\n(EPI in Python), Daily Coding Problem, and Algorithms by Skiena all\nlined up and ready to go. Hopefully by the end of the year I\u2019ll be a\nbetter problem solver then.

Well, those are the areas I\u2019ll be covering, and maybe documenting\nhere and there through this blog. Hopefully I\u2019ll be a better software\ndeveloper, and maybe I\u2019ve encouraged you to be a better developer.

HTTP Servers

Redis

Sqlite

Disk

Hashing

Networks

Talking about Systems

Google Search

Amazon S3

Conclusion

Index Implementation

Searching our Index

Giving it a test run

Current

Past

Implementation

ARM64

x86_64

Conclusions

What are System Calls?

System Call Numbers

Writing system calls (with assembly)

Writing a C Function that makes a system call

Putting it all together

Reading and writing to files

Setup

Reading Older RSS Posts

What about Atom?

Rendering more content with morss

Papers

Links

Courses

Books

Utilities

Programming

Textbooks

Courses

Papers

Projects

Discrete Math

Textbooks

Linear Algebra

Textbooks

Statistics

Textbooks

Theory of Computation

Textbooks

Computer Architecture

Textbooks

Resources

Algorithms and Data Structures

Textbooks

Operating Systems

Textbooks

Networking

Textbooks

Projects

Databases

Textbooks

Projects

Compilers

Textbooks

Resources

Distributed Systems

Textbooks

A Plan of Attack

Compiling to Assembly from Scratch

Crafting Interpreters

ChibiCC

Conclusion

POSIX and SUS

GCC vs MUSL

What Sacrifices were made?

How do you make static binaries?

Don\u2019t name your functions _init

getopt_long doesn\u2019t exist

Sysctl isn\u2019t standard

lstat has optional fields

NI_MAXHOST isn\u2019t defined

Getting the Toolchains

Don\u2019t name your functions `_init`

`getopt_long` doesn\u2019t exist

`NI_MAXHOST` isn\u2019t defined