This is the sort of thing you humans do for fun?

So, like many things I wind up doing, it all started when I saw a single picture on Twitter. In this case, it was a manga panel with a bunny-eared girl attributed to a manga named “Mimi Mix!”

Anyway, the drawing style was cute so I looked it up on a manga reader site and read the first three chapters or so, at which point I decided I was liking the story – a sort of fluffy slice-of-life story about a cafe with some heavy yuri elements, though I perhaps should more properly describe it as a heavily yuri story with some light cafe elements – but really disliking the experience of reading it online, what with all of the ads and the loading times and blah blah blah.

So I went to download it from any of the usual sites and quickly discovered that it just wasn’t out there, or at least wasn’t out there in any of the places I knew to look. In addition, it had never been licensed for translation, so I couldn’t buy it from Kindle or iBooks, and it wasn’t even available through Bookwalker’s Japanese site, which is my go-to for buying Japanese-language manga and ebooks.

Amazon.co.jp had the series available in paper form, with used volumes starting at one yen and no new copies available except at “collector’s” pricing. For a manga released in 2017, it seemed to have already been completely forgotten.

So I went back to the manga reader site, and considered my options.

  1. I could just read it on the site, but that involved a really mediocre reading experience.
  2. I could download it image by image and then mash it back together into a .cbz file for iComics. This would work but would be far too manual.
  3. Safari gave me the option to save off an entire page – so, one chapter – as a .webarchive file. This seemed the best option, and I even found a command line to extract the contents of a .webarchive.

For the record, it’s “textutil -convert html (filename)”, thanks to https://badcoffee.club/how-to-extract-images-from-webarchive-file-using-terminal/ for the tip.

This gave me a folder of images (both comic pages and ad images) and HTML files, which was a big advantage over downloading one image at a time. But it still wasn’t quite right.
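As a rough sketch of that single step, here is one way to unpack a .webarchive into a folder of its own so the extracted files from different chapters don't pile up together. The function name extract_one is made up for illustration and is not part of the script further down; the only real command involved is the textutil call quoted above.

```bash
#!/bin/bash
# extract_one: hypothetical helper, not part of the original script.
# Moves one .webarchive into a folder named after it and unpacks it there.
extract_one() {
	local archive="$1"
	local name
	# strip the directory part and the .webarchive extension
	name="$(basename "${archive%.webarchive}")"
	mkdir -p "$name" || return 1
	mv "$archive" "$name"/ || return 1
	# run in a subshell so the caller's working directory is left alone
	( cd "$name" && textutil -convert html "$name.webarchive" )
}
```

Calling extract_one "chapter-01.webarchive" should leave a chapter-01 folder holding the archive plus whatever textutil extracts from it.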

A few hours of shell scripting later and I had a bash script that would take a folder full of .webarchive files and do its best to make a single manga volume out of them.

I’ll put that script here, with no guarantees or warranty. It works for me.

#!/bin/bash
#
# extract_webarchive - batch webarchive extractor, for saving manga
# Usage: extract_webarchive all - creates folders for each web archive and extracts contents
#	 extract_webarchive filename - extracts a single web archive
#	 extract_webarchive (filename or all) cleanup - try to remove all files other than pages; with "all", also make a single cbz
#
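# compgen -G succeeds when at least one file matches the glob; unlike
# [ ! -f *.webarchive ], it does not break when several files match
if ! compgen -G "*.webarchive" > /dev/null
then
	echo "No webarchive files found in current directory"
	exit 1
fi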

currentdir=${PWD##*/}
# thanks to https://stackoverflow.com/questions/1371261/get-current-directory-name-without-full-path-in-a-bash-script
if [ "$1" = "all" ];
then
	echo "Batch Conversion"
	for filename in *.webarchive
	do
	echo "$filename"
	filenamenoext=${filename%.*} 
	if [ -f "$filenamenoext".webarchive ];
	then
		mkdir "$filenamenoext"
		mv "$filename" "$filenamenoext"
		cd "$filenamenoext"
		textutil -convert html "$filename"
		if [ "$2" = "cleanup" ];
		then
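			# set aside every .jpg with a 1-4 character name (the comic pages),
			# delete everything else (ads, html), then move the pages back
			mkdir pages
			mv ?.jpg ??.jpg ???.jpg ????.jpg pages
			rm *
			mv pages/* .
			rmdir pages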
		fi
		cd ..
	fi
	
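	# zip adds to an existing archive, so every chapter folder ends up
	# in one .cbz named after the current directory
	if [ "$2" = "cleanup" ];
	then	
		zip -r "$currentdir".cbz "$filenamenoext"/*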
		rm "$filenamenoext"/*
		rmdir "$filenamenoext"
	fi
	done

else
	echo "Single File Conversion"

	filename="$1"
	echo "$filename"
	filenamenoext=${filename%.*} 
	if [ -f "$filenamenoext".webarchive ];
	then
		mkdir "$filenamenoext"
		mv "$filename" "$filenamenoext"
		cd "$filenamenoext"
		textutil -convert html "$filename"
		if [ "$2" = "cleanup" ];
		then
			mkdir pages
			mv ?.jpg ??.jpg ???.jpg ????.jpg pages
			rm *
			mv pages/* .
			rmdir pages
		fi
		cd ..
	fi
fi
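
One thing worth noting about the cleanup step: the ? globs keep any .jpg whose name is one to four characters long, so an ad image named something like ad.jpg would survive. A slightly stricter sketch is below, assuming the comic pages are named with digits only (1.jpg, 12.jpg and so on), which depends entirely on the site. The function name keep_numeric_pages is hypothetical and not part of the script above.

```bash
#!/bin/bash
# keep_numeric_pages: hypothetical stricter cleanup, assuming page images
# are named with digits only (e.g. 1.jpg, 12.jpg).
# Deletes every other regular file directly inside the given folder.
keep_numeric_pages() {
	local dir="$1" f base
	for f in "$dir"/*; do
		# skip subfolders, and the unexpanded glob when the folder is empty
		[ -f "$f" ] || continue
		base="${f##*/}"
		if [[ ! "$base" =~ ^[0-9]+\.jpg$ ]]; then
			rm -- "$f"
		fi
	done
}
```

Something like keep_numeric_pages "chapter-01" could stand in for the mkdir pages / mv / rm dance, at the cost of being wrong for any site that names its pages differently.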

That wasn’t QUITE enough, though. One of the things I’ve been trying to do, now that iPadOS has a halfway-decent file manager, is to try to use my iPad as a laptop replacement. It’s not quite there, but it does most of the things a traditional computer can do most of the time.

One thing it certainly can’t do is run shell scripts. But Safari on the iPad CAN save a web page as a .webarchive, and I could save it to a folder on iCloud Drive that could then be read by my Mac Mini.

And from the Mac Mini, I could create a Folder Action through Automator to watch the iCloud Drive folder for new files and run my script against the folder. So now I can drop manga chapters into this folder one at a time and the script will build a .cbz file out of them, which I can then copy back into iComics on the iPad.
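
For anyone wanting to try the same thing, the Folder Action can be little more than a "Run Shell Script" action with "Pass input" set to "as arguments". What follows is a minimal sketch of what such an action's body might look like, not a copy of my actual workflow. The EXTRACTOR path is an assumed install location and run_for_new_files is a made-up name.

```bash
#!/bin/bash
# Hypothetical body for an Automator "Run Shell Script" action in a Folder
# Action workflow, with "Pass input" set to "as arguments".
# EXTRACTOR is an assumed install location for the script above.
EXTRACTOR="${EXTRACTOR:-$HOME/bin/extract_webarchive}"

run_for_new_files() {
	local f
	for f in "$@"; do
		case "$f" in
			*.webarchive)
				# one run handles every archive in the folder,
				# so stop after the first match
				( cd "$(dirname "$f")" && "$EXTRACTOR" all cleanup )
				return
				;;
		esac
	done
}

# Automator hands the newly added files to the action as arguments
run_for_new_files "$@"
```

Files that aren't web archives are ignored, so dropping anything else into the watched folder does nothing.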

At some point, I even went back and finished reading the rest of the series that had started me down this path. It’s a fun read, and I challenge you to not have a goofy grin on your face throughout it, but I don’t think it’s one that really sticks with you once you’re done with it.

That’s not really the point, though. I just wanted to document my madness.

This entry was posted in anime, mac.
