When I was a kid, we had this Ukrainian-English-French dictionary at home, so now that I have kids of my own I went looking for a copy of the same book. It's sold by a person in Toronto and can be bought here.

While the publisher provides the dictionary online for free, the user experience is hindered by a slow server and an outdated interface. For those interested, you can still access it here.

I wanted to make a better experience. With a small Python script, I scraped the site's audio, images and data (the hit locations that trigger each audio clip). It took a lot of regex because some of the data lived in JavaScript embedded in the page.
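To give a feel for the regex part, here is a minimal sketch of pulling a data array out of JavaScript embedded in a page. The variable name `hotspots`, the field names and the sample page are all made up for illustration, since the real site's markup isn't shown here, and the approach assumes the embedded literal happens to be valid JSON.

```python
import json
import re

# Hypothetical pattern: assumes the page embeds something like
#   var hotspots = [{"x": 10, "y": 20, "audio": "word1.mp3"}];
# The real site's variable name and layout are not given in this post.
HOTSPOT_RE = re.compile(r"var\s+hotspots\s*=\s*(\[.*?\])\s*;", re.DOTALL)


def extract_hotspots(html: str) -> list:
    """Return the hotspot list embedded in a page's JavaScript, or [] if absent."""
    match = HOTSPOT_RE.search(html)
    if match is None:
        return []
    # Only works when the literal is valid JSON (double quotes, no trailing commas).
    return json.loads(match.group(1))


# Made-up page content standing in for a scraped response body.
sample_page = """
<html><script>
var hotspots = [{"x": 10, "y": 20, "audio": "word1.mp3"},
                {"x": 55, "y": 80, "audio": "word2.mp3"}];
</script></html>
"""

hotspots = extract_hotspots(sample_page)
for spot in hotspots:
    print(spot["x"], spot["y"], spot["audio"])
```

If the embedded data is not valid JSON (single quotes, unquoted keys), the `json.loads` step fails and the fields have to be picked out with more regex instead, which is likely where most of the effort goes.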

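The audio has a very noticeable hiss. Running it through ffmpeg with a low-pass filter at 3000 Hz, a high-pass filter at 200 Hz and an FFT denoiser (afftdn) removed most of it. These are the arguments I passed to ffmpeg from the script:

['ffmpeg', '-i', currentFile, '-af', 'lowpass=3000,highpass=200,afftdn=nf=-25', newLocation]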
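As a rough sketch, the same filter chain could be applied to a whole folder of clips like this. The helper names, the folder layout and the `.mp3` extension are assumptions for illustration, as is the use of `subprocess.run`. It needs ffmpeg on the PATH.

```python
import subprocess
from pathlib import Path

# Filter chain from the post: low-pass at 3000 Hz, high-pass at 200 Hz,
# then FFT-based denoising with a -25 dB noise floor.
AUDIO_FILTER = "lowpass=3000,highpass=200,afftdn=nf=-25"


def build_denoise_command(source: Path, target: Path) -> list[str]:
    """Build the ffmpeg argument list for one file (hypothetical helper)."""
    return ["ffmpeg", "-i", str(source), "-af", AUDIO_FILTER, str(target)]


def denoise_folder(source_dir: Path, target_dir: Path) -> None:
    """Run the filter over every .mp3 in source_dir (extension is an assumption)."""
    target_dir.mkdir(parents=True, exist_ok=True)
    for source in sorted(source_dir.glob("*.mp3")):
        command = build_denoise_command(source, target_dir / source.name)
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(command, check=True)
```

Calling `denoise_folder(Path("audio_raw"), Path("audio_clean"))` would process the folder; both directory names are placeholders.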

For the images, I used Upscayl, installed from its GitHub repo. I set it up locally and ran a batch job to upscale the images. The upscaling worked surprisingly well, giving 4x the resolution and removing some of the JPG noise.

For the data and UI I had two objectives: make it responsive, including showing two images at a time when viewed on a tablet, and have the audio trigger on click rather than on mouse-over (to make it mobile friendly).

Overall this was a quick little experiment to play with scraping and optimization, and to make a book I would actually use.

Published on: 2023-11-12