It’s not uncommon to find writeups on the web describing how to preserve bash
history. Set a HISTFILE value that differentiates your sessions (by pty,
user id, date, &c), and you’ve got some durability. Throw in a generous
HISTSIZE and you’ve got history for days.
I’ve been capturing my shell history for a couple years on my personal laptop.
My own swiped-from-all-over version of history preservation looks like this:
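In outline, anyway; the directory name, sizes, and formats below are representative stand-ins rather than exact values:

    # One history file per day / per tty / per uid, so concurrent
    # sessions don't clobber each other.
    HISTDIR="${HOME}/.bash_history.d"
    mkdir -p "${HISTDIR}"
    export HISTFILE="${HISTDIR}/$(date +%F).$(tty | tr '/' '.').${UID}"
    export HISTSIZE=100000              # lines kept in shell memory
    export HISTFILESIZE=-1              # never truncate the file (bash >= 4.3)
    export HISTTIMEFORMAT='%F %T  '     # also writes #epoch lines to HISTFILE
    shopt -s histappend
    # flush each command as it runs, so a crashed session loses nothing
    PROMPT_COMMAND="history -a${PROMPT_COMMAND:+; ${PROMPT_COMMAND}}"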
But as much fun as it is to fill storage volumes with the events of the past
for their own sake, these writeups usually show some analysis too: grep, wc,
maybe some awk action if the blogger / stackoverflow essayist / mastodon
tooter is of a certain vintage. I went a different way with it. My first
sloppy run at it:
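Sketched here in Python rather than verbatim (first_words.py is a made-up name; it tallies the first word of each command and skips the #epoch timestamp lines the setup above produces):

    #!/usr/bin/env python3
    # first_words.py -- count the first word of each history line.
    # Usage: python3 first_words.py ~/.bash_history.d/*
    import sys
    from collections import Counter

    counts = Counter()
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue  # blank line, or a #epoch timestamp comment
                counts[line.split()[0]] += 1

    for word, n in counts.most_common(20):
        print(f"{n:8d}  {word}")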
Output looks like this:
Then I thought “what if I have shell history for 100 years? what then?”
Because that’s a totally reasonable thing to plan for. Then, “what if the
bash shell speaks natural language in 100 years?” So I took a run at it
using a Python tokenizing library. Specifically,
nltk. No particular reason
beyond it having pretty good gJuice when I was hunting for options.
That implementation looks like this:
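Or a sketch of it, at least; word_tokenize and FreqDist are stock nltk, while the surrounding loop is representative rather than verbatim:

    #!/usr/bin/env python3
    # Tokenize every history line (timestamps included) and tally
    # token frequency with nltk's FreqDist.
    # Usage: python3 tokenize_history.py ~/.bash_history.d/*
    import sys
    import nltk
    from nltk.tokenize import word_tokenize

    nltk.download("punkt", quiet=True)  # tokenizer models, first run only

    tokens = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            for line in fh:
                tokens.extend(word_tokenize(line.strip()))

    fdist = nltk.FreqDist(tokens)
    for token, n in fdist.most_common(20):
        print(f"{n:8d}  {token}")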
I’m not ignoring the timestamps here (bash stores them in the history file
as #epoch comment lines, and the tokenizer splits off that leading ‘#’), so
the token distribution heavily favors ‘#’. But the actual command
distribution ends up looking not too different from the simplistic
first-word-counting version:
So what’s the point of looking at this data? Well, right off the top, I sure
do seem to spend a lot of time checking what files exist in my ${CWD}. So
maybe I should do alias l=ls and reduce my keystrokes by half at a stroke.
Or maybe set my ${PROMPT_COMMAND} to auto-list any directory I pushd into.
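That second idea might sketch out like this (__autols is a made-up name, and this fires on any directory change, not just pushd):

    # List the new directory whenever the prompt notices $PWD changed.
    __autols_last="${PWD}"
    __autols() {
        if [[ "${PWD}" != "${__autols_last}" ]]; then
            __autols_last="${PWD}"
            ls
        fi
    }
    PROMPT_COMMAND="__autols${PROMPT_COMMAND:+; ${PROMPT_COMMAND}}"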
The possibilities for bikeshedding are endless, given the many places I could
optimize my command line environment using my own history.