I am the alpha nerd

I do not make this claim lightly.

OK, I make it pretty lightly.  “Alpha Nerd” is a pretty high bar to hit, after all, and should really be reserved for people who hand-roll binary patches for their custom Linux kernels.

But I’m feeling pretty good about today’s project, so I will brag a bit.

I am fortunate enough to be married to a woman with very good taste in media.  She consumes an ungodly amount of manga and light novels from all over Asia, but in particular has been REALLY into Chinese stuff recently.

There’s been just one problem.  While people who translate Japanese content tend to just put it up for download, the culture is completely different when it comes to translations of Chinese content.  It’s much more common to see them published as individual blog posts, and they often disappear from the web.

Naturally, she wanted local copies of her favorites so they couldn’t just poof, and I thought I’d solved this last year when I found a scraper program that would download fan translations as ePubs.

I had not, to be clear, solved it.  There were still a lot of sites that it didn’t address.

She came up with a solution on her own for these – a program called Goodlinks which allows you to archive local copies of web pages.  It’s not very automated, though, and some of these novels are made up of hundreds of individual web pages.  So saving a single novel locally was a process of opening each of these pages, one at a time, and saving them.

A few months ago, my answer would have been “wow, that sounds rough” because I did not have a solution.

Today, I had a solution.

Well, Copilot helped.  Really, it did almost all of the work to start.

The first thing I asked Copilot for was a Python script that could be passed a URL and would return a list of all the links on the referenced page.  This was easy enough, but Goodlinks couldn’t import a plain list of links.  Goodlinks WOULD take a bookmarks.html file, though, so I did some hacking at an exported bookmarks.html file until I figured out what format it wanted its URLs in.
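For anyone curious what that first step looks like, here is a minimal sketch using only the Python standard library.  It is not the script Copilot actually produced; the names LinkCollector, extract_links, and links_on_page are made up for illustration, and it assumes the page is plain HTML that doesn’t need JavaScript to render its links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links like "/ch1" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in an HTML string, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def links_on_page(url):
    """Download a page and return the links on it (hypothetical helper)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_links(html, url)
```

Calling links_on_page with a table-of-contents URL would give back one absolute URL per link on that page, which is all the rest of the pipeline needs.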

For the record, it wants one link per line in the file, and for some reason every one of them needs to be prefaced with <DT>.
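As a rough sketch of that format: the function below writes one <DT> line per link.  The anchor-tag form and the DOCTYPE and <DL> wrapper lines are assumptions on my part, borrowed from what browsers typically put in an exported bookmarks.html; I have not confirmed exactly how much of it Goodlinks needs.  The name to_bookmarks_html is made up for illustration.

```python
from html import escape


def to_bookmarks_html(links):
    """Render a list of URLs as a bookmarks.html-style string.

    Assumption: the wrapper lines mimic a typical browser export.
    The only part the post confirms is one <DT>-prefixed link per line.
    """
    lines = ["<!DOCTYPE NETSCAPE-Bookmark-file-1>", "<DL><p>"]
    for url in links:
        safe = escape(url, quote=True)  # "&" in query strings becomes "&amp;"
        # The URL doubles as the bookmark title, since we have nothing better.
        lines.append(f'<DT><A HREF="{safe}">{safe}</A>')
    lines.append("</DL><p>")
    return "\n".join(lines) + "\n"
```

Using the URL itself as the link title keeps the sketch simple; a nicer version could carry the link text along from the original page.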

All of this was 100% Copilot, with a little “And could it do this instead?” from my side.

It didn’t take long.  Like, 20-30 minutes from “I can do this better” to “Here’s your completed Python script!”

Thing is, though, handing someone a Python script is not super helpful.  And my wife doesn’t really like to turn on her computer.  So I needed a solution that could work from a phone.

After considering a few options, I decided that I would set up an email address that would accept emails with a URL in the body, throw that URL at the Python script that Copilot had given me, and email the resulting HTML file back to the address that the URL had come from.  And, because I enjoy self-harm, I decided to do this on one of my Linux VMs.

My first assumption was that it would be easy to have an email client on Linux watch for emails of a specific format and send the emails to an external script for processing.  This turned out to be my first, but not my worst, assumption because… well, I guess if you’re a Linux guy you are expected to use webmail for stuff.  It took me a few clients before I stumbled onto Evolution, which lets you set up an email filter that will pipe the body of the email through an external command.

I was in business!  It turned out that it was actually really easy to take the output from Evolution and send it through a simple shell script to parse out the sender’s email address and the URL from the incoming email, and to put the URL through the Python script I’d generated earlier, and to…
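My version of this step was a shell script, but the same idea is easy to sketch in Python, which keeps it in one language with the link scraper.  This assumes the filter hands over the raw message with its headers intact (so there is a From: line to read); the function name sender_and_url and the URL pattern are made up for illustration.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Deliberately simple: grab the first thing that looks like an http(s) URL.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def _plain_text_body(msg):
    """Return the first text/plain part of a message as a string."""
    parts = msg.walk() if msg.is_multipart() else [msg]
    for part in parts:
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def sender_and_url(raw_message):
    """Pull the sender's address and the first URL out of a raw email."""
    msg = message_from_string(raw_message)
    _, sender = parseaddr(msg.get("From", ""))
    match = URL_PATTERN.search(_plain_text_body(msg))
    return sender, (match.group(0) if match else None)
```

In a real filter the raw message would come from standard input, for example sender_and_url(sys.stdin.read()), and a missing URL (None) would be the cue to do nothing.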

…well, now I had an HTML file but I needed to mail it back.

I had THOUGHT that you could do this from Thunderbird, and it turns out that you can!

Almost.

Kinda.

Sorta.

Well.

…from a command line, you can tell Thunderbird to generate an email, and it will populate an email message, and then it will sit there and wait for you to manually click the Send button.  It won’t go that last step.  There are workarounds, of course, but they involve using desktop control software to simulate a mouse click on the pixel on the screen that should be over the Send button.

OK.  So how do I send an email from the command line?

Some googling led me to a program called sSMTP, and then I spent probably two hours just trying to get it to authenticate to a Gmail account.  Gmail has some pretty strong authentication requirements, though, and I could not figure out how to jump through all of the required hoops.

Thankfully, The ISP Formerly Known As Comcast isn’t quite as picky.  You need to go into your email settings and tell it to accept email from third-party applications, but once you’ve done that you can use any email client.

Despite being able to authenticate, though, I still couldn’t get it to send an email.  This may be because, unbeknownst to me, I at some point managed to get my email flagged for spam by iCloud and so all of my test emails were being dumped into the ether.  It may also have been because sSMTP was deprecated!  We’ll never know which it was, because I eventually moved to a mail program named msmtp which is apparently the replacement for sSMTP.  That was the first point where I could actually send emails to myself from the command line, and where I thought I had really turned a corner…

…except I couldn’t attach a file.

Some further research, and I found that I would need to install an email client that understood how to MIME-encode an email and attach a file to it.  There’s one called “mutt” that will do this, and it will even use msmtp as the program that does the actual sending, so all of the work it took me to get msmtp configured wouldn’t be wasted.
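Here is a hedged sketch of that last hop, driving mutt from Python.  It assumes mutt is installed and already configured to hand outgoing mail to msmtp (typically a sendmail setting in your muttrc), and it uses mutt’s -s flag for the subject and -a for the attachment, with -- separating attachments from recipients.  The function names, subject line, and body text are made up for illustration.

```python
import subprocess


def build_mutt_command(recipient, attachment_path, subject="Your bookmarks file"):
    """Build the argument list for sending one attachment with mutt."""
    # "--" tells mutt the attachment list is over and addresses follow.
    return ["mutt", "-s", subject, "-a", attachment_path, "--", recipient]


def mail_attachment(recipient, attachment_path, body="Bookmarks file attached.\n"):
    """Send the file; mutt reads the message body from standard input."""
    subprocess.run(
        build_mutt_command(recipient, attachment_path),
        input=body,
        text=True,
        check=True,  # raise if mutt exits with an error
    )
```

Passing the command as a list rather than a single string avoids shell quoting problems, which matters when the recipient address comes straight out of an incoming email.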

And I got that configured.  And I tried my script again.

And finally, after about six hours of staring at terminal windows and willing them to work, I got to a point where I could send myself an email, with a URL in the email, and Evolution would receive the email and send it off to my Python script for parsing, and the script would download the referenced web page and make a bookmarks.html file out of it, and pass that to mutt, and mutt would bundle it up and send it to msmtp for mailing, and the bookmarks.html file would land in my inbox and could be easily imported into Goodlinks.

I mean, really it is just so obvious! I don’t know what took me so long.

I kinda don’t know whether I should actually brag about this or not.  I am fully prepared for someone to stumble across this and point me at a one-click solution for the whole dang thing.  Please be gentle if that someone is you.

 

For later reference, here are some of the sites I found to help me get through this whole nightmare:

https://arnaudr.io/2020/08/24/send-emails-from-your-terminal-with-msmtp/

https://linsnotes.com/posts/sending-email-from-raspberry-pi-using-msmtp-and-mutt/

https://hostpresto.com/tutorials/how-to-send-email-from-the-command-line-with-msmtp-and-mutt/

https://www.baeldung.com/linux/send-emails-from-terminal

 

