This topic is a log of technical tasks @maiki documents.
I’m going to try to document the various tasks I accomplish. Each time I try to write a resume I come up blank, and after doing this so long it’s obviously a shortcoming on my part not to document, celebrate, internalize, and learn from my tasks on a different level.
I’ve been testing a hybrid approach to translations, particularly for long web forms with many single-word answers.
The interface integrates with a few machine translation services, with side-by-side viewing and direct field editing. Once a translator finishes their edits, the data is saved in a pool of pre-existing translation strings.
Our hope is this will speed up the process for everyone involved, as presenting the content in each language is tedious work and everyone is stretched thin.
My initial testing seems decent, but we’ll need to check with translators/native readers to see if it is worth the change…
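Sketched out, the string pool works something like this; all the names here are hypothetical stand-ins, not our actual platform code:

```python
# Minimal sketch of a translation string pool: translator-edited strings
# are saved keyed by (source string, target language), and later forms
# can auto-fill any field whose source string already has a vetted match.

class TranslationPool:
    def __init__(self):
        self._pool = {}  # (source, lang) -> translated string

    def save(self, source, lang, translated):
        """Store a translator-edited string for reuse."""
        self._pool[(source, lang)] = translated

    def lookup(self, source, lang):
        """Return a pooled translation, or None if not yet translated."""
        return self._pool.get((source, lang))

    def autofill(self, fields, lang):
        """Pre-fill a form's fields from the pool; untranslated ones stay None."""
        return {f: self.lookup(f, lang) for f in fields}


pool = TranslationPool()
pool.save("Yes", "es", "Sí")
pool.save("No", "es", "No")
filled = pool.autofill(["Yes", "No", "Maybe"], "es")
# "Yes" and "No" come back pre-filled; "Maybe" is left for a translator.
```

Machine translation feeds the first drafts; once a human edits a string, the pool means nobody translates "Yes" twice.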
Despite dealing with a lot (Daily log, maiki - #7 by maiki, Health log, maiki - #11 by maiki), I’ve been setting up a new process using machine-learning auto-translation tools to assist translators with the work of translating hypermedia documents (text, images, audio, video, forms, maps, etc…) for the public.
We had an okay system before, but the pandemic and other factors have introduced difficulties, such as training new translators on the system, even if they are translating a single piece of content. This new process will accommodate both our in-house translators working directly on the platform, as well as sending previews for vetting and a public-facing reporting system so anyone can give us feedback.
The platform has many users adding content, and each dept. or campaign might have its own resources to contribute to translation (translators on the team, special budget, that sorta thing). This new system allows us to be more flexible and accept translation efforts through a variety of channels.
Which is good: the more we can push translation “behind the curtain”, the less of a pain point it is (and it definitely is one). But with this hybrid approach, we get the cost- and effort-savings from auto-translation, while ensuring the specialized topic (city planning and transportation information and feedback) is understandable by a human.
Switching gears a bit, I’ve built several survey forms for over 20 transportation strategies, which includes captioned images in the form itself. Took a bit of work, but all the HTML is proper, so none of the forms will break in strange and mysterious ways.
As I was entering the questions I noted they were formulated in a reusable manner, which means our translation system will make quick work of them, storing translated strings and auto-filling them where needed.
I’m designing a small feedback form to solicit information about a given page on a website. For instance, an issue with the translation or a layout issue: someone fills out this form and I’m notified.
Including the URL from which the form is filled out is a common bit of metadata. Last night, falling asleep, I realized I could take an extra step here and help myself with an issue I specifically have: seeing the page as a site visitor does.
I’m always logged in as an admin, or have my user-agents configured in a particular way, which makes it kinda difficult to see the sites I’m maintaining as a visitor would. Also, sometimes errors are intermittent; a layout may break because of a comment of a certain length, for instance, but as comments are always being added, that will change.
My plan: send the URL to the Wayback Machine for a time-stamped snapshot of the page as the visitor is experiencing the issue.
So neat, ne?! This works because the economic models for the sites I build are aligned with integrity and fixing problems, rather than hiding problems and projecting a “brand”. And having a snapshot of the site as problems are found and shared with us for solutions is a great tool.
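The core of the hook is tiny; here’s a sketch using the Wayback Machine’s public save endpoint (the function names and User-Agent string are just placeholders, not the actual form code):

```python
# Sketch of the feedback-form hook: when a visitor reports an issue,
# ask the Wayback Machine's "Save Page Now" endpoint to archive the
# page, giving us a time-stamped snapshot of what they actually saw.
import urllib.request

SAVE_ENDPOINT = "https://web.archive.org/save/"

def snapshot_url(page_url):
    """Build the Save Page Now URL for a given page."""
    return SAVE_ENDPOINT + page_url

def request_snapshot(page_url, timeout=30):
    """Fire the archive request; returns the HTTP status code."""
    req = urllib.request.Request(
        snapshot_url(page_url),
        headers={"User-Agent": "feedback-form-bot"},  # placeholder UA
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

The form handler would call `request_snapshot()` with the reported URL right when the submission comes in, so the archived copy matches what the visitor saw.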
I’ve made a note to write a specific tutorial to myself on how and where I install Hugo. Currently it is at ~/.local/bin/hugo, and I think that is a rather, hmmm, “portable” place to have it. Same with Fossil.
Anyhow, after getting Hugo in place, I was able to generate a functional site with several XMPP Clients in a matter of minutes. Yay “muscle memory”!
Yesterday I built a complex survey full of conditional logic so folks are answering only questions that apply to them. Today I’m going through the feedback and finishing it, but that means explaining privacy policies and accessibility and primacy/recency bias and multilingual considerations…
A client has a big push of content coming through, multiple surveys for online and print. Because of recent improvements to our platform translation system, it is now optimal to use the platform to translate everything, and then base the printed materials on the translations!
Sitting under the redwoods (Daily log, maiki - #32 by maiki) I was relaxed and realized my issue with videos wasn’t an issue I had at all. As in, it’s an issue for others. The tools don’t work for me, and I don’t want to be responsible for hosting videos since it could end up costing me a lot of money personally. On the other hand, I don’t trust any org to host video for me, as the profit motive doesn’t align with my simple needs… and then I realized there was one org that could do it for me: the Internet Archive!
I just need a host that accepts metadata and can generate playlists occasionally, while also not being creepy by design. I don’t need the “social layer” of video, because I have talkgroup and other points-of-info on the network, where I can link to video assets… and done.
Over the weekend I came upon a tool I had never seen: Detwinner.
> Detwinner is the simplest and the fastest tool for removing duplicate files from your Linux PC.
I ran it on my big data archive and cleared out 29GB! I thought there would be a lot of duplicate music and photos, but that was hardly the case. It was large files that I’d stashed recursively. Like, three copies of an 800MB zip containing an album in FLAC…
And it has a set of criteria to bulk-select which copies to keep; the one I used was “file with shortest path”, which worked great, as I process data by going out to the furthest “leaf” node in my file “tree” and trying to move it closer to… well, root. I even have a script I run on directories that recursively checks each one for files and removes any that contain none.
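That empty-directory cleanup boils down to something like this (a sketch, not the exact script):

```python
# Sketch of recursive empty-directory pruning: walk bottom-up so child
# directories are removed before their parents are checked, letting a
# whole empty branch collapse in one pass.
import os

def prune_empty_dirs(root):
    """Remove all empty directories under root; returns the removed paths."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # leave the root itself in place
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed
```

The `topdown=False` is the important bit: a parent that only contained empty children is itself empty by the time the walk reaches it.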
Anyhow, making a note: from their website I would never have gone for it, but that was the exact functionality I wanted.
Yesterday I hit a milestone by figuring out how to merge Hugo templates with data from Wikidata to create multilingual pages edited on wikidata.org. Currently I’m just drawing down company descriptions, but now I know, and the fun can begin…
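For reference, pulling the descriptions looks roughly like this, using Wikidata’s `wbgetentities` API; the item ID, languages, and sample response are just illustrative:

```python
# Sketch of pulling multilingual descriptions from Wikidata's
# wbgetentities API, ready to drop into a Hugo data file.
import urllib.parse

API = "https://www.wikidata.org/w/api.php"

def build_query(item_id, languages):
    """Build the wbgetentities request URL for an item's descriptions."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "descriptions",
        "languages": "|".join(languages),
        "format": "json",
    }
    return API + "?" + urllib.parse.urlencode(params)

def extract_descriptions(response, item_id):
    """Flatten the API's JSON response into {lang: description}."""
    descs = response["entities"][item_id]["descriptions"]
    return {lang: d["value"] for lang, d in descs.items()}

# Example of the shape the API returns (trimmed, values illustrative):
sample = {"entities": {"Q95": {"descriptions": {
    "en": {"language": "en", "value": "American technology company"},
    "es": {"language": "es", "value": "empresa tecnológica estadounidense"},
}}}}
```

Fetch the JSON, flatten it with `extract_descriptions()`, and Hugo’s templates can render one page per language from the same data.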