Morning Links

The Cloud Under the Sea

The world’s emails, TikToks, classified memos, bank transfers, satellite surveillance, and FaceTime calls travel on cables that are about as thin as a garden hose. There are about 800,000 miles of these skinny tubes crisscrossing the Earth’s oceans, representing nearly 600 different systems, according to the industry tracking organization TeleGeography. The cables are buried near shore, but for the vast majority of their length, they just sit amid the gray ooze and alien creatures of the ocean floor, the hair-thin strands of glass at their center glowing with lasers encoding the world’s data.

Fantastic article. Reading it, I'm nostalgic for my time as a Sailor in the US Navy. If I were a younger man, I might be tempted to seek out a position on one of these ships.

Tackle complex reasoning tasks with Mistral Large, now available on Amazon Bedrock

Calculate the difference in payment dates between the two customers whose payment amounts are closest to each other in the given dataset:

'{"transaction_id": {"0": "T1001", "1": "T1002", "2": "T1003", "3": "T1004", "4": "T1005"}, "customer_id": {"0": "C001", "1": "C002", "2": "C003", "3": "C002", "4": "C001"}, "payment_amount": {"0": 125.5, "1": 89.99, "2": 120.0, "3": 54.3, "4": 210.2}, "payment_date": {"0": "2021-10-05", "1": "2021-10-06", "2": "2021-10-07", "3": "2021-10-05", "4": "2021-10-08"}, "payment_status": {"0": "Paid", "1": "Unpaid", "2": "Paid", "3": "Paid", "4": "Pending"}}'

Output

To find the difference in payment dates between the two customers whose payment amounts are closest to each other, we first need to identify the two closest payment amounts and then find their respective payment dates.

The payment amounts are: 125.5, 89.99, 120.0, 54.3, 210.2

The two closest payment amounts are 125.5 and 120.0. The payment date for the payment amount 125.5 is ‘2021-10-05’ and for the payment amount 120.0 is ‘2021-10-07’.

The difference in payment dates can be calculated as follows:

Date difference = |‘2021-10-07’ - ‘2021-10-05’| = 2 days

So, the difference in payment dates between the two customers whose payment amounts are closest to each other is 2 days.
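The model's reasoning is easy to check in ordinary code. Here's a minimal Python sketch (my own, not from the article) that finds the pair of transactions with the closest payment amounts in the sample dataset and computes the difference between their payment dates:

```python
from datetime import date
from itertools import combinations

# (customer_id, payment_amount, payment_date) for each transaction in the prompt
payments = {
    "T1001": ("C001", 125.5, date(2021, 10, 5)),
    "T1002": ("C002", 89.99, date(2021, 10, 6)),
    "T1003": ("C003", 120.0, date(2021, 10, 7)),
    "T1004": ("C002", 54.3, date(2021, 10, 5)),
    "T1005": ("C001", 210.2, date(2021, 10, 8)),
}

# Pick the pair of transactions with the smallest gap in payment amount.
a, b = min(combinations(payments.values(), 2), key=lambda p: abs(p[0][1] - p[1][1]))

diff_days = abs((a[2] - b[2]).days)
print(diff_days)  # 2
```

The closest amounts are indeed 125.5 and 120.0, two days apart, matching the model's answer.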

AWS does well when taking complex systems and packaging them up for consumers. I don’t think we’ve seen the full impact of AI yet. Not even close.

Casey Liss - Every Failure is an Opportunity

So, I called Sonos support this morning, and braced for a fight — I’m about six months out of the warranty coverage.

Thirty minutes later, without any sort of arguing or complaining, I have an RMA, and I’ll get a new Roam in a couple of weeks.

Reminds me of the time I took my newly broken iPod Mini into an Apple Store. I showed it to the guy at the counter; he said he couldn’t tell for sure whether it was accidental damage or a manufacturing defect, so he went in the back and brought me a brand new one. I’ve been a fan since. Sounds like Sonos learned the right lesson.

New Spatial Personas

A pair of links for this one, one from each point of view:

I was able to invite my friend Stephen Hackett’s Persona over to my house for a play date and we were able to chat face to face in a way that just seemed more natural than talking to a persona in a box. It felt more like it was him.

After our excellent ideation session, we played a round of Battleship in the excellent Game Room on Apple Arcade. After I sunk all of his ships, we watched a few minutes of For All Mankind in a couple of immersive environments.

I had always wondered about SharePlay, prior to the Vision Pro. Why would I want to keep my phone up looking at someone on FaceTime while we both watched a movie? Never made sense. Now in the light of immersive virtual reality, it makes perfect sense.

The one about the web developer job market

We have the worst job environment for tech in over two decades and that’s with the “AI” bubble in full force. If that bubble pops hard before the job market recovers, the repercussions to the tech industry will likely eclipse the dot-com crash.

Well, this gave me a lot of food for thought.

xz Utils Backdoor

I simply don’t believe this was the only attempt to slip a backdoor into a critical piece of Internet software, either closed source or open source. Given how lucky we were to detect this one, I believe this kind of operation has been successful in the past. We simply have to stop building our critical national infrastructure on top of random software libraries managed by lone unpaid distracted—or worse—individuals.

Not great news from the software industry these past couple of days.

Loading and Indexing SQLite

What a difference a couple of lines of code can make.

I recognize that databases have always been a weak point for me, so I’ve been trying to correct that lately. I have a lot of experience managing database engines, failover, filesystems, and networking, but too little working with the internals of the databases themselves. Early this morning I decided I didn’t know enough about how database indexes worked. So I did some reading, got to the point where I had a good mental model for them, and decided I’d like to do some testing myself. I figured 40 million records was a nice round number, so I used fakedata to generate 40 million SQL inserts that looked something like this:

INSERT INTO contacts (name,email,country) VALUES ("Milo Morris","pmeissner@test.tienda","Italy");
INSERT INTO contacts (name,email,country) VALUES ("Hosea Burgess","kolage@example.walmart","Dominica");
INSERT INTO contacts (name,email,country) VALUES ("Adaline Frank","shaneIxD@example.talk","Slovenia");

I saved this as fakedata.sql and piped it into sqlite3 and figured I’d just let it run in the background. After about six hours I realized this was taking a ridiculously long time, and I estimated I’d only loaded about a quarter of the data. I believe that’s because SQLite was treating each INSERT as a separate transaction.

A transaction in SQLite is a unit of work. SQLite ensures that each write to the database is Atomic, Consistent, Isolated, and Durable, which means that for each of the 40 million lines I was piping into sqlite3, the engine was making sure that line was fully committed to the database before moving on to the next one. That’s a lot of work for a very, very small amount of data. So, I did some more reading and found a recommendation to explicitly wrap the entire load in a single transaction, so my file now looked like:

BEGIN TRANSACTION;
INSERT INTO contacts (name,email,country) VALUES ("Milo Morris","pmeissner@test.tienda","Italy");
INSERT INTO contacts (name,email,country) VALUES ("Hosea Burgess","kolage@example.walmart","Dominica");
INSERT INTO contacts (name,email,country) VALUES ("Adaline Frank","shaneIxD@example.talk","Slovenia");
COMMIT;

I set a timer and ran the import again:

➜  var time cat fakedata.sql | sqlite3 test.db
cat fakedata.sql  0.07s user 0.90s system 1% cpu 1:13.66 total
sqlite3 test.db  70.81s user 2.19s system 98% cpu 1:13.79 total

So, that went from 6+ hours to about 71 seconds. And I imagine if I did some more optimization (possibly using the Write Ahead Log?) I might be able to get that import faster still. But a little over a minute is good enough for some local curiosity testing.
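The same batching idea applies when loading SQLite from code. As a small sketch (my own example, scaled way down from 40 million rows), Python’s built-in sqlite3 module queues all the rows from executemany inside one implicit transaction, and nothing is durably written until commit() is called, so there’s one sync for the whole batch instead of one per row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE contacts (name TEXT, email TEXT, country TEXT)")

rows = [
    ("Milo Morris", "pmeissner@test.tienda", "Italy"),
    ("Hosea Burgess", "kolage@example.walmart", "Dominica"),
    ("Adaline Frank", "shaneIxD@example.talk", "Slovenia"),
]

# All three inserts happen inside a single transaction; the durable
# commit (the expensive part) happens once, at con.commit().
con.executemany("INSERT INTO contacts (name, email, country) VALUES (?, ?, ?)", rows)
con.commit()

print(con.execute("SELECT COUNT(*) FROM contacts").fetchone()[0])  # 3
```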

Indexes

So… back to indexes.

Indexing is a way of sorting a number of records on one or more fields. Creating an index on a field in a table creates another data structure that holds the field value and a pointer to the record it relates to. This index structure is kept sorted, which allows binary searches to be performed on it.

One good analogy is the index of a physical book. Imagine that a book has ten chapters and each chapter has 100 pages. Now imagine you’d like to find all instances of the word “continuum” in the book. If the book doesn’t have an index, you’d have to read through every page in every chapter to find the word.

However, if the book is already indexed, you can find the word in the alphabetical list, which will then have a pointer to the page numbers where the word can be found.

The downside to an index is that it takes additional space. In the book analogy, while the book itself is 1,000 pages, we’d need another ten or so for the index, bringing the total size to 1,010 pages. It’s the same with a database: the additional index data structure requires more space to hold both the original data field being indexed and a small (4-byte, for example) pointer to each record.
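To make that concrete, here’s a toy version of the sorted-structure-plus-pointer idea in Python, using binary search via the standard bisect module. This is only an illustration of the concept (real SQLite indexes are B-trees, and the table contents here are made up):

```python
import bisect

# Hypothetical table: rowid -> record
table = {
    1: ("Milo Morris", "Italy"),
    2: ("Adaline Frank", "Slovenia"),
    3: ("Hank Perry", "Dominica"),
}

# A toy "index" on name: a sorted list of (field value, pointer) pairs.
name_index = sorted((rec[0], rowid) for rowid, rec in table.items())

def lookup(name):
    # Binary search the sorted index, then follow each pointer back to the table.
    i = bisect.bisect_left(name_index, (name,))
    matches = []
    while i < len(name_index) and name_index[i][0] == name:
        matches.append(table[name_index[i][1]])
        i += 1
    return matches

print(lookup("Hank Perry"))  # [('Hank Perry', 'Dominica')]
```

The binary search touches O(log n) index entries instead of scanning every row, which is why the indexed SELECT below is so much faster.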

Oh, and the results of creating the index are below.

SELECT * from contacts WHERE name is 'Hank Perry';
Run Time: real 2.124 user 1.771679 sys 0.322396
CREATE INDEX IF NOT EXISTS name_index on contacts (name);
Run Time: real 22.129 user 16.048308 sys 2.274184
SELECT * from contacts WHERE name is 'Hank Perry';
Run Time: real 0.003 user 0.001287 sys 0.001598

That’s a massive improvement. And now I know a little more than I did.

The Perfect ZSH Config

If you spend all day in the terminal like I do, you come to appreciate its speed and efficiency. I often find myself in Terminal for mundane tasks like navigating to a folder and opening a file; it’s just faster to type where I want to go than it is to click in the Finder, scan the folders for the one I want, double-click it, scan again… small improvements to the speed of my work build up over time. That speed increases even further with the right configuration for your shell, in my case, zsh.

zsh is powerful and flexible, which means it can also be intimidating to configure yourself. Doubly so when there are multiple ‘frameworks’ available that will do the bulk of the configuration for you. I used Oh My Zsh for years, but I recently abandoned it in favor of maintaining my own configuration, keeping only the settings I need for my use.

I’ve split my configuration into five files:

  • apple.zsh-theme
  • zshenv
  • zshrc
  • zsh_alias
  • zsh_functions

I keep all five files in a dotfiles Git repository, pushed to a private GitHub repo.

The zshenv file is read first by zsh when starting a new shell. It contains a collection of environment variables I’ve set, mainly for development. For example:

export PIP_REQUIRE_VIRTUALENV=true
export PIP_DOWNLOAD_CACHE=$HOME/.pip/cache
export VIRTUALENV_DISTRIBUTE=true

The next file is zshrc, which contains the main bulk of the configurations. My file is 113 lines, so let’s take it a section at a time.

source /Users/jonathanbuys/Unix/etc/dotfiles/apple.zsh-theme
source /Users/jonathanbuys/Unix/etc/dotfiles/zsh_alias
source /Users/jonathanbuys/Unix/etc/dotfiles/zsh_functions

The first thing I do is source the other three files. The first is my prompt, which is cribbed entirely from Oh My Zsh. It’s nothing fancy, but I consider it to be elegant and functional. I don’t like the massive multi-line prompts. I find them to be far too distracting for what they are supposed to do.

My prompt looks like this:

 ~/Unix/etc/dotfiles/ [master*] 

It gives me my current path, what git branch I’ve checked out, and if that branch has been modified since the last commit.

The next two files, as their names suggest, contain aliases and functions. I have three functions and 16 aliases. I won’t go into each of them here, as they are fairly mundane and specific to my setup. The three functions print the current path of the open Finder window, use Quick Look to preview a file, and generate a UUID string.

The next few lines establish some basic settings.

autoload -U colors && colors
autoload -U zmv
setopt AUTO_CD
setopt NOCLOBBER
setopt SHARE_HISTORY
setopt HIST_IGNORE_DUPS
setopt HIST_IGNORE_SPACE

The autoload lines set up zsh to use pretty colors and enable the extremely useful zmv command for batch file renaming. The interesting parts of the setopt settings are the ones dealing with command history: SHARE_HISTORY shares command line history between open windows or tabs, so if I have multiple Terminal windows open, I can browse the history of both from either window, while the two HIST_IGNORE options keep duplicate and space-prefixed commands out of that history. I find myself thinking an environment is broken if this is not present.

Next, I set up some key bindings:

# start typing + [Up-Arrow] - fuzzy find history forward
bindkey '^[[A' up-line-or-search
bindkey '^[[B' down-line-or-search

# Use option as meta
bindkey "^[f" forward-word
bindkey "^[b" backward-word

# Use option+backspace to delete words
x-bash-backward-kill-word(){
    WORDCHARS='' zle backward-kill-word
}
zle -N x-bash-backward-kill-word
bindkey '^W' x-bash-backward-kill-word

x-backward-kill-word(){
    WORDCHARS='*?_-[]~\!#$%^(){}<>|`@#$%^*()+:?' zle backward-kill-word
}
zle -N x-backward-kill-word
bindkey '\e^?' x-backward-kill-word

These settings let me use the arrow keys to browse history, use option + arrow keys to move one word at a time through the current command, and use option + delete to delete one word at a time. Incredibly useful; I use it all the time. Importantly, this also lets me do incremental searching through my command history with the arrow keys. So, if I type aws, then arrow up, I can browse all of my previous commands that start with aws. And when you have to remember commands that have 15 arguments, this is absolutely invaluable.

The next section has to do with autocompletion.

# Better autocomplete for file names
WORDCHARS=''
unsetopt menu_complete   # do not autoselect the first completion entry
unsetopt flowcontrol
setopt auto_menu         # show completion menu on successive tab press
setopt complete_in_word
setopt always_to_end
zstyle ':completion:*:*:*:*:*' menu select

# case insensitive (all), partial-word and substring completion
if [[ "$CASE_SENSITIVE" = true ]]; then
  zstyle ':completion:*' matcher-list 'r:|=*' 'l:|=* r:|=*'
else
  if [[ "$HYPHEN_INSENSITIVE" = true ]]; then
    zstyle ':completion:*' matcher-list 'm:{[:lower:][:upper:]-_}={[:upper:][:lower:]_-}' 'r:|=*' 'l:|=* r:|=*'
  else
    zstyle ':completion:*' matcher-list 'm:{[:lower:][:upper:]}={[:upper:][:lower:]}' 'r:|=*' 'l:|=* r:|=*'
  fi
fi
unset CASE_SENSITIVE HYPHEN_INSENSITIVE

# Complete . and .. special directories
zstyle ':completion:*' special-dirs true
zstyle ':completion:*' list-colors ''
zstyle ':completion:*:*:kill:*:processes' list-colors '=(#b) #([0-9]#) ([0-9a-z-]#)*=01;34=0=01'
zstyle ':completion:*:*:*:*:processes' command "ps -u $USERNAME -o pid,user,comm -w -w"

# disable named-directories autocompletion
zstyle ':completion:*:cd:*' tag-order local-directories directory-stack path-directories

# Use caching so that commands like apt and dpkg complete are useable
zstyle ':completion:*' use-cache yes
zstyle ':completion:*' cache-path $ZSH_CACHE_DIR

zstyle ':completion:*:*:*:users' ignored-patterns \
        adm amanda apache at avahi avahi-autoipd beaglidx bin cacti canna \
        clamav daemon dbus distcache dnsmasq dovecot fax ftp games gdm \
        gkrellmd gopher hacluster haldaemon halt hsqldb ident junkbust kdm \
        ldap lp mail mailman mailnull man messagebus mldonkey mysql nagios \
        named netdump news nfsnobody nobody nscd ntp nut nx obsrun openvpn \
        operator pcap polkitd postfix postgres privoxy pulse pvm quagga radvd \
        rpc rpcuser rpm rtkit scard shutdown squid sshd statd svn sync tftp \
        usbmux uucp vcsa wwwrun xfs '_*'

if [[ ${COMPLETION_WAITING_DOTS:-false} != false ]]; then
  expand-or-complete-with-dots() {
    # use $COMPLETION_WAITING_DOTS either as toggle or as the sequence to show
    [[ $COMPLETION_WAITING_DOTS = true ]] && COMPLETION_WAITING_DOTS="%F{red}…%f"
    # turn off line wrapping and print prompt-expanded "dot" sequence
    printf '\e[?7l%s\e[?7h' "${(%)COMPLETION_WAITING_DOTS}"
    zle expand-or-complete
    zle redisplay
  }
  zle -N expand-or-complete-with-dots
  # Set the function as the default tab completion widget
  bindkey -M emacs "^I" expand-or-complete-with-dots
  bindkey -M viins "^I" expand-or-complete-with-dots
  bindkey -M vicmd "^I" expand-or-complete-with-dots
fi

# automatically load bash completion functions
autoload -U +X bashcompinit && bashcompinit

That’s a long section, but in a nutshell this lets me type one character, then hit tab, and be offered a menu of all the possible completions of that character. It is case-insensitive, so b would match both boring.txt and Baseball.txt. I can continue to hit tab to cycle through the options, and hit enter when I’ve found the one I want.

The last section sources a few other files:

[ -f ~/.fzf.zsh ] && source ~/.fzf.zsh
[ -f "/Users/jonathanbuys/.ghcup/env" ] && source "/Users/jonathanbuys/.ghcup/env" # ghcup-env
[ -s "/Users/jonathanbuys/.bun/_bun" ] && source "/Users/jonathanbuys/.bun/_bun"
source /Users/jonathanbuys/Unix/src/zsh-autosuggestions/zsh-autosuggestions.zsh
source /Users/jonathanbuys/Unix/src/zsh-syntax-highlighting/zsh-syntax-highlighting.zsh

If I’m experimenting with Haskell, I’d like to load the ghcup-env variables. If I have bun installed (a way, way faster npm), then use that. The final two sources are for even more enhanced autosuggestions and command line syntax highlighting: typos or commands that don’t exist will be red, commands where zsh can find the executable will be green. The autosuggestions take commands from my history and suggest them; I can type right-arrow to accept the suggestion, or keep typing to ignore it.

Taken together, I’ve been able to remove Oh My Zsh, but keep all of the functionality. My shell configuration is constantly evolving as I find ways to make things faster and more efficient. I don’t consider myself a command line zealot, but I do appreciate how this setup gets out of my way and helps me work as fast as I can think.


p.s. A lot of this configuration was taken from other sources shared around the internet, as well as the zsh documentation. I regret that I haven’t kept references to the origins of some of these configs. If I can find the links I’ll post them here.

Future Work and AI

I’ve been trying to wrap my small monkey brain around what ChatGPT will mean in the long run. I’m going to try to think this through here. In many ways the advances we’ve seen in AI this past year continue the automation trend that’s existed since… well, since humans started creating technology. I’ve seen arguments at two ends of a spectrum: that AI is often wrong and unreliable and we shouldn’t use it for anything important, and that AI is so good it’s going to put us all out of jobs. As with most truths, I think the reality is somewhere in between.

It’s my opinion that AI will replace a lot of the jobs it can replace, but not all of them. Referring back to our discussion about the current state of Apple news sites: if the site is a content farm pumping out low-value articles for hit counts and views, I can see AI handling that. If the site is well-thought-out opinions and reviews about things around the Apple ecosystem, that I think will be safe, because it’s the person’s opinion that gives the site value.

For more enterprise-y jobs, I could see fewer low- and mid-level developers. Fewer managers, fewer secretaries, fewer creatives. Not all gone, but certainly fewer than before. If your job is to create stock photos and put together slide shows, you might want to expand your skill set a bit.

I think… the kind of jobs that will survive are the type that bring real value. The kind of value that can’t be replicated by a computer. Not just the generation of some text or code, but coming up with the why. What needs to be made, and why does it need to be made?

Maybe AI will help free us up to concentrate on solving really hard problems. Poverty, clean water, famine, climate change. Then again, maybe it’ll make things worse. I suppose in the end that’s up to us.

Solar-powered system offers a route to inexpensive desalination - MIT News

Now, a team of researchers at MIT and in China has come up with a solution to the problem of salt accumulation — and in the process developed a desalination system that is both more efficient and less expensive than previous solar desalination methods. The process could also be used to treat contaminated wastewater or to generate steam for sterilizing medical instruments, all without requiring any power source other than sunlight itself.

That’s good news, more of this please.

Link

Coding as a greybeard - Hacker News

“I’m 51 and I’ve been active in this industry since I was 14. I watched it grow from computers with 4k of memory to having a supercomputer in my pocket. I was learning in the age of Apple II and the Commodore PET. When I realized that I could create an explosion of data with just a few lines of code, I was hooked forever. It was such a magical thing. I found some other guys in my high school that were also into computers and we started meeting regularly on Fridays and Saturdays to… Well, to do some things that were, perhaps, not allowed. Since then, I’ve started three companies, and I don’t think I could have found the same satisfaction in any other industry. I am mindful, these days, that I’m 51 because I know ageism is a thing in tech. There’s a moment when you walk into a room and people think, ‘Oh, he’s a greybeard.’ I don’t have a beard, but you know what I mean. But when I start to talk about things and find solutions, that disappears. I can’t change my age but I am in full control over what I do and what I read and how much time I carve out to write code. I can still see myself doing this when I’m 60, 70 years old. Even older. Because I want to keep doing meaningful things.”

Interesting thread, especially for those of us with an increasing amount of grey.

Link

Use One Big Server - Speculative Branches

We have all gotten so familiar with virtualization and abstractions between our software and the servers that run it. These days, “serverless” computing is all the rage, and even “bare metal” is a class of virtual machine. However, every piece of software runs on a server. Since we now live in a world of virtualization, most of these servers are a lot bigger and a lot cheaper than we actually think.

Link