o< cuisine migration

For once I’m blogging in English here, because I’d expect most people who’d actually have any interest in this entry to also read English 🙂

I’ve been wanting to move o< cuisine to another place for a while now. The main issue is that I’m a very lazy sysadmin, and consequently maintaining a WordPress on a personal server was not necessarily the bestest of ideas, especially since it’s a blog I don’t write on that often, and so I don’t see the blinking update notifications as often as I should. So moving to the hosted wordpress.com platform made sense (I had happily moved this one a few months ago, and I could definitely live with that.)

Now the issue is that my WordPress install is somewhat peculiar. In the past, I’ve been using a plugin to display photos from Gallery (hosted on the same server): thankfully, this one had already been taken care of during the dotclear-to-wordpress migration, so I had reasonably clean URLs to work with. More problematically, I was also using the shashin plugin to display pictures from Picasa/Google+/Google Photos/the current iteration of that thing. And there, the shashin plugin was actually keeping its own data and not replacing with proper links (which completely makes sense given the use case) – so instead of pictures, I had stuff like [simage=], that were transformed when rendering the corresponding blog entry, and not statically. So I knew from the start that this would not work with wordpress.com, where plugins are essentially a no-no. This is the main reason why I hadn’t migrated yet. The other reason was that I had a neat little thing to add recettes.de tags at the end of my entries, and that I’d have to say goodbye to it as well. Anyway, it was not simply a matter of exporting/re-importing, I knew I’d have to process the exported result quite a bit before being able to import it as I wanted on the new platform.

Important preliminary caveat

I’m going to link a few scripts here. I consider these scripts to be very much throw-away-execute-once-code, they are VERY ugly (I’m not a Python coder, and yes, I’m parsing XML with regexps, deal with it, and my dealing with text encoding is very much on the YOLO side – it works until it doesn’t). I hesitated a while before actually adding them to this article, especially with the whole « throw-away » concept; on the other hand, maybe, just maybe, it may be of some use as an example to someone, and if I don’t provide them now then I might get some requests for them at a less convenient time. I’m also fairly sure that if I provide them, no-one will have a look at them, and if I don’t, someone will ask. So there. Note: I’m not looking for code reviews here, there’s a LOT of things I’d improve on these things if I cared, but I really don’t, so… 🙂 Also, I’m not guaranteeing that they won’t have catastrophic results if you decide to execute them for the lulz. So don’t do that, probably.

Step 0: evaluate the damage

So I knew it wouldn’t be enough to export/re-import. First thing I did was to actually do that, though. I exported the blog from my own WordPress, I re-imported it into a dummy WordPress blog. As expected, I lost most of the images. The good surprise, though, was that the images that were actually stored on WordPress were exported and re-imported as well. Other than that, nothing much to say, but most importantly, no bad surprise.

First things first: the Gallery URLs

The easy part of the migration had to do with the fact that I had a bunch of image URLs that were pointing to the gallery with a relative URL. So that wouldn’t fly.

There, the approach was very much brute-force, and pretty much something like

sed "s+/gallery/+http://cuisine.palats.com/gallery/+g" export.xml > export-with-proper-gallery-links.xml

After checking that this did everything I wanted it to do and only what I wanted it to do, time for the next step.

Next: removing shashin tags

So as I was mentioning, the main issue that I had with the images was that they were simply encoded with an ID, and that this ID was associated with the picture data themselves in the database. So time to brush up on the use of the mysql console. I started with a show tables; to get an idea of where that information would actually be stored; then I checked that my understanding of the storage was correct with a few queries like select * from wp_shashin_photo where id=2203;. While it technically doesn’t prove anything, it was good enough for me 😉

Time for a little bit of Python. Fetch IDs and corresponding URLs, search what I want to replace, replace it, be happy. I put the script in question here; please re-read the above caveat if it’s not burnt into your brain yet.

After I ran that script, I still had a few tags that weren’t taken into account by my splendid processing engine; since there were 3 or 4 of them, I edited them by hand, re-ran the script, and voilà.
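The replacement step can be pictured with a minimal Python sketch. It is not the script mentioned above: the tag shape ([simage=ID] optionally followed by comma-separated options), the PHOTO_URLS mapping and the example.com URL are all assumptions made for illustration. In practice the mapping would be built from the rows of wp_shashin_photo.

```python
import re

# Hypothetical mapping from shashin photo ID to image URL (assumption for
# this sketch); the real one would come from the wp_shashin_photo table.
PHOTO_URLS = {
    "2203": "http://example.com/photos/2203.jpg",
}

# Assumed tag shape: [simage=ID] with optional comma-separated options.
SIMAGE_TAG = re.compile(r"\[simage=(\d+)(?:,[^\]]*)?\]")


def replace_simage_tags(text, photo_urls):
    """Replace known [simage=...] tags with <img> tags; leave unknown IDs untouched."""
    def substitute(match):
        url = photo_urls.get(match.group(1))
        if url is None:
            # Keep the tag as-is so it can be spotted and fixed by hand.
            return match.group(0)
        return '<img src="{}" />'.format(url)

    return SIMAGE_TAG.sub(substitute, text)
```

Leaving unknown tags in place mirrors what happened here: the handful of leftovers stay easy to grep for and edit by hand.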

While I’m at it: getting rid of Gallery

At that point, I felt pretty much ready to flip the switch and migrate the whole thing. And then when discussing with my husband, it was mentioned that it also would be reaaaally nice if I could actually get rid of the Gallery dependency as well. This could clearly not be done by hand: there were a bit more than 2000 pictures on that gallery, so it was a non-trivial task.

I looked at my options to host pictures somewhere else: I essentially needed the new host to allow me to upload the pictures, keep track of which URL they were available at, and replace the old Gallery links with the new ones. Easy-peasy. I first had a look at Google Photos – while it’s all fancy and nice, the complete lack of API made the thing not an option. I vaguely looked at options via Drive, and I ended up deciding against it – too complicated, too random. On the other hand, Flickr has a reasonable API, a terabyte of space, and everything’s nice and shiny. So I did a few tests with flickr_api, decided it would work well enough, and started the whole thing.

I went with a three-step approach:

  • grab everything that looked like a gallery URL, fetch the corresponding full-res image from the gallery, and store it on disk along with a URL/filename matching file,
  • upload everything to Flickr, making a note of where Flickr was actually storing my stuff,
  • replace the gallery URLs with the Flickr ones.

Again, easy-peasy.

The first step was the easiest one: I just needed to grep the Gallery URLs, generate the correct URL that would fetch the right picture, download, store in non-stupid place on disk, update log of operations. Of course, it wasn’t exactly as simple as that – I had a few URLs deviating from the usual pattern and I had to tweak the script a bit to take them into account. Said script is here; again, beware, here be dragons.

So at the end of this operation, I had a directory full of pictures and a file matching the gallery html code to said files. As a sanity check, I looked for files less than 10K, which would more probably be server error messages than my images, and found none. So far, so good.
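The size check is easy to script. A minimal Python sketch, where files_smaller_than is a name made up for this example and the 10K threshold is the one mentioned above:

```python
import os


def files_smaller_than(directory, limit_bytes=10 * 1024):
    """Return the sorted paths of all files under `directory` smaller than `limit_bytes`."""
    suspicious = []
    for root, _dirs, names in os.walk(directory):
        for name in names:
            path = os.path.join(root, name)
            # A tiny "image" is more likely a saved error page than a photo.
            if os.path.getsize(path) < limit_bytes:
                suspicious.append(path)
    return sorted(suspicious)
```

An empty list is the good outcome: nothing on disk looks like a server error message saved under an image name.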

Then came the upload to Flickr. No rocket surgery there either – take the pictures, upload to Flickr via Magic API, log where they end up, and done. Except that when I started the script, I realized that uploading a picture would actually take several seconds, possibly up to ten. The very annoying thing there was that 2000 pictures, 10 seconds apiece, is essentially 5 hours. The whole process was almost trivially parallelizable (instead of uploading one picture at a time, start 15 threads that do the same thing), but I knew nothing about Python parallelization primitives, and even then, my concurrent programming skills are somewhat rusty. (There was a hilarious moment before I resigned myself to the fact that yes, I needed to synchronize the access to stdout, and that it would NOT generally be okay.) All in all, I decided that I’d probably put these 5 hours to better use learning how to do that thing instead of waiting for it to be done, so I did that. And in the end, after a bit of fiddling, I ended up with a working solution – which is here. It didn’t upload everything on the first run, because sometimes Flickr’s API is, well, flickery, but since I was logging everything and not re-uploading stuff that had already been uploaded, well, I just ran it a few times and in the end everything was uploaded. Flickery API also meant that I had to actually treat exceptions that would happen, because otherwise my 15-thread script would actually end up quite quickly in a three, two, one, zero active thread script. The more you know.
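For the record, the shape of that solution can be sketched in a few lines of Python 3. This is not the script linked above: it uses concurrent.futures rather than whatever that script does, upload_all is a made-up name, and the upload argument stands in for the real Flickr call (any function that takes a path and returns a photo ID, or raises).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def upload_all(paths, upload, already_done, workers=15):
    """Upload every path not in `already_done`; return {path: photo_id} for successes."""
    results = {}
    # One lock protects the results dict and keeps log lines from interleaving.
    lock = threading.Lock()

    def worker(path):
        try:
            photo_id = upload(path)
        except Exception as error:
            # A flaky API call must not take the worker thread down with it.
            with lock:
                print("FAILED {}: {}".format(path, error))
            return
        with lock:
            results[path] = photo_id
            print("OK {} -> {}".format(path, photo_id))

    # Skip what a previous run already uploaded, so re-running is cheap.
    todo = [path for path in paths if path not in already_done]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(worker, todo))
    return results
```

Running it again with the previous successes passed as already_done retries only the failures, which is the "just run it a few times" approach described above.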

Because of previous mishaps, I was a bit wary of the fact that maybe I’d logged stuff that wasn’t actually properly uploaded. So I wrote a small verification script, which I also happened to multi-thread, because now that I knew how it went, it was easy 🙂 – here it is.

And finally, last but not least, I needed to replace my Gallery URLs with Flickr URLs. I actually made more calls to the API in this script: when uploading the images, I was getting and logging the ID, but I needed another query to get the image in the correct size. That could probably have been done in a previous step, but oh well. Last script for today here.

And at the end of that operation, I finally had a full import XML for WordPress, with all my images where I wanted them, so yay. I imported it on a fresh WordPress.com URL, and there, the new o< cuisine is alive!

Tying loose ends: DNS, Apache and all these internets

Now the thing is, o< cuisine on its previous address was, well, not exactly known, but you could find it on the internets. So I kind of wanted to keep the old links working. Except I couldn’t really, because the URL had a subfolder in it, which is not supported by wordpress.com as a valid redirection. So I set up a new subdomain on my domain, set it up as primary domain for my WordPress blog, added a Redirect 301 "/coinblog/" "http://coincuisine.pasithee.fr/" in my current Apache configuration on the old domain, and voilà.

Conclusion: what I (re-)learned

  • I actually enjoyed the whole thing a lot, to my own surprise. It was fairly nice to hack something into submission and in the end to have a working result. I should do that sort of things more often 😉
  • I learnt a bit about Flickr API – always a nice-to-know.
  • I learnt a bit about Python concurrency primitives, that’s actually very useful.
  • I have a new favorite grep option, « -o », as in grep -o '.\{0,10\}gallery.\{0,10\}' oltcuisine.wordpress.xml – it displays only the match, here with up to 10 characters of context before and after it. I actually used that thing 10 times today at work (yes, really.)
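The same trick carries over to Python’s re module, should the context ever be needed inside a script. A minimal sketch, where matches_with_context is a name made up for this example:

```python
import re


def matches_with_context(text, needle, width=10):
    """Return every occurrence of `needle` with up to `width` characters on each side."""
    pattern = re.compile(
        ".{0,%d}%s.{0,%d}" % (width, re.escape(needle), width)
    )
    return pattern.findall(text)


print(matches_with_context("0123456789ABCgalleryXYZ", "gallery"))
# → ['3456789ABCgalleryXYZ']
```

As with grep, the dot does not cross line breaks, so the context stops at the end of the line.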

Oh, and the image illustrating this blog entry obviously comes from o< cuisine: Clafoutis.

Service announcement: Google+ RSS

This post is purely informational, for people who like RSS (me), who want to see the nonsense (mostly in English) that I post publicly on Google+, and who have something against the fact that Google+ is essentially only accessible through the Google+ web interface (and an API that keeps throwing tantrums, or so I hear).

I’ve just set up an RSS feed of my public G+ through a thing called gplusrss.com; make good use of it (or not). It’ll work for as long as it works, with whatever quality of service it has, but it’s supposed to work for now. There.

End of this self-centered service announcement.

Renaming photo folders

I remembered having tweeted this line of Bash a while ago. Needing it today, I went looking for it. Twitter’s search thing is pretty bad, so I took the brute-force but nonetheless successful option: I plugged the RSS feed of my Twitter into Google Reader and ran a search in Google Reader. Not even hard 😉

So, for the next time I go looking for it, here it is:

for i in *_*; do mv $i 2010-`expr substr $i 7 2`-`expr substr $i 5 2`; done

As for the semantics, well… my Pentax camera stores photos in a folder named XXX_ddmm, where XXX is a number incremented for each shooting day, starting at 100 and restarting at 100 every time the memory card is emptied, dd is the day and mm the month. So I rename my photo folders to 2010-mm-dd (and I’ll have to update the line in a few months, tough.)

There’s probably a more elegant way, but this one works 😉

I probably could also have opened the camera manual to see whether there was a way to change the names of the folders it creates.
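For comparison, the name computation can be written in Python with the year as a parameter, which avoids editing the line every January. A minimal sketch, assuming folder names of the form XXX_ddmm (three digits, an underscore, two-digit day, two-digit month), which is what the character offsets in the one-liner imply; dated_folder_name is a name made up for this example.

```python
import re

# Assumed folder shape: three digits, underscore, day (2 digits), month (2 digits).
FOLDER_NAME = re.compile(r"^\d{3}_(\d{2})(\d{2})$")


def dated_folder_name(name, year):
    """Return 'YYYY-mm-dd' for a camera folder name, or None if it does not match."""
    match = FOLDER_NAME.match(name)
    if match is None:
        return None
    day, month = match.groups()
    return "{}-{}-{}".format(year, month, day)


print(dated_folder_name("100_2503", 2010))
# → 2010-03-25
```

Returning None for anything that does not look like a camera folder makes it safe to run over a directory that also contains already-renamed folders.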


Srsly, 3 months? And Pythonesque considerations on EXIF

I realize with horror that it’s been almost three months since I last blogged here. That’s dreadful. Let’s fix that right away.

To be fair, I’ve been busy elsewhere:

… swamped, basically.

Anyway. By the way, happy new year and all that; it’s still January, so honor is safe.

If I’ve decided to pick the keyboard back up today, it’s to share a tiny bit of very ugly code. Context: I’m seriously considering buying something a tad faster than my current lenses for the Pentax. And since I don’t want to spend both arms on it (one will do), that almost mechanically implies a prime lens. And since I’m indecisive, I’m hesitating between two focal lengths: 35 and 50. So I hacked a little thing together in a corner to find out whether I tend to hover around 35 or rather around 50. I restricted myself to the cooking photos, because that’s where I take the majority of my pictures after all, and if I get a lens that doesn’t suit me for cooking, I can tell it’s going to depress me.

Warning, it’s (as usual) probably very ugly and strictly inflexible. But it does roughly what I want, so if it can be of use to someone… (I can’t be the only one asking myself this kind of question, can I? I am? Oh well.) It’s Python, obviously, because Python is good.

    from PIL import Image
    from PIL.ExifTags import TAGS
    import os

    def get_exif(fn):
        ret = {}
        i = Image.open(fn)
        info = i._getexif()
        for tag, value in info.items():
            decoded = TAGS.get(tag, tag)
            ret[decoded] = value
        return ret

    path = "/home/isa/Photos/cuisine"
    range35 = 0
    range50 = 0
    for root, dir, fnames in os.walk(path):
      for fname in fnames:
        fname = root + "/" + fname
        if os.path.basename(fname).lower().endswith("jpg"):
          try:
            exif = get_exif(fname)
            model = exif["Model"]
            if(model.startswith("PENTAX K-m")):
              (num1, num2) = get_exif(fname)["FocalLength"]
              focal = num1/num2
              if(focal >= 31 and focal <= 39):
                range35 = range35 +1
              elif(focal >= 46 and focal <= 54):
                range50 = range50 + 1
          except:
            pass
    print "Range autour de 35 :"
    print range35
    print "Range autour de 50 :"
    print range50

And, for those who are interested, the result is unambiguous:

    Range autour de 35 :
    951
    Range autour de 50 :
    437
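The counting logic can be separated from the EXIF reading, which makes it easier to try other focal lengths. A minimal Python 3 sketch: count_focal_ranges and RANGES are names made up for this example, the bounds are the ones used in the script, and the focal lengths are assumed to be plain numbers in millimetres already extracted from the files.

```python
# Inclusive bounds around each candidate prime lens, as in the script above.
RANGES = {"35": (31, 39), "50": (46, 54)}


def count_focal_ranges(focal_lengths, ranges):
    """Count how many focal lengths fall inside each labelled inclusive range."""
    counts = {label: 0 for label in ranges}
    for focal in focal_lengths:
        for label, (low, high) in ranges.items():
            if low <= focal <= high:
                counts[label] += 1
    return counts


print(count_focal_ranges([35, 31, 50, 18, 40.5], RANGES))
# → {'35': 2, '50': 1}
```

Values outside every range (18 and 40.5 here) are simply ignored, as in the original loop.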

For the 35 I have “the choice” between Sigma’s 1.4 and Pentax’s 2.8 macro… it’ll probably be the macro; the 40cm+ focusing distance on the Sigma scares me a bit. It annoys me a little, in the sense that I don’t gain that much in brightness. Anyway… photography is a field of permanent frustration :p (unless you’re very rich and have a dromedary to carry the gear).

PS: I’ve just dropped by Gculicious, a bit late, a precious resource if ever there was one, and found there an EXIF stats tool for Flickr accounts. And there, I have roughly the same number of photos in both categories… damned :p

Yet another new WM: Xmonad

This is turning into a monomania: I’ve just finished configuring yet another window manager. After grumbling about wmii’s various crashes (never took the time to investigate) and the pathological instability of Awesome’s configuration (that one I may come back to in a year or two, once the config has stopped breaking at every update; it’s seriously annoying), here I am on Xmonad. Not much to say about it: it’s a tiling window manager; I’ve tasted those and I don’t want anything else. Fun fact: it’s written in Haskell. The little Haskell I’ve done, I liked; I’ll have to get back to it at some point 😉

So, on my Arch (and the laptop’s Ubuntu) I installed xmonad (careful: on Arch at least, the dependencies seem to have been computed with an axe; it installs all of Haskell and everything that comes with it, for a total of 500M on disk…), trayer (to get a little notification area, which is often handy) and xmobar, to get an info bar.

Next, the config:

/home/isa/.xmonad/xmonad.hs:

    import XMonad
    import XMonad.Hooks.ManageDocks
    import XMonad.Hooks.EwmhDesktops
    import XMonad.Layout.NoBorders
    import XMonad.Util.Run(spawnPipe)
    import XMonad.Hooks.DynamicLog
    import System.IO
    import qualified Data.Map as M
    import qualified XMonad.StackSet as W
    import Data.Bits ((.|.))

    main = do
        xmproc <- spawnPipe "/usr/bin/xmobar /home/isa/.xmobarrc"
        xmonad $ defaultConfig {
            terminal = "xterm -fg white -bg black -font -*-terminus-medium-r-*-*-20-*-*-*-*-*-iso10646-1 -u8",
            modMask = mod4Mask,
            keys = \c -> mykeys c `M.union` keys defaultConfig c,
            manageHook = manageDocks <+> manageHook defaultConfig,
            logHook = dynamicLogWithPP $ xmobarPP
                            { ppOutput = hPutStrLn xmproc
                            , ppTitle = xmobarColor "green" "" . shorten 50
                            },
            layoutHook = smartBorders $ ewmhDesktopsLayout $ avoidStruts $ layoutHook defaultConfig,
            workspaces = map show [1..9],
            focusedBorderColor = "#729fcf",
            normalBorderColor = "#aaaaaa",
            borderWidth = 2
            } where
            mykeys conf@(XConfig {XMonad.modMask = modMask}) = M.fromList $
                    [((modMask, xK_b), sendMessage ToggleStruts),
                     ((modMask, xK_semicolon), sendMessage (IncMasterN (-1)))]
                    ++
                    [((m .|. modMask, k), windows $ f i)
                        | (i, k) <- zip (XMonad.workspaces conf) [0x22,0xAB,0xBB,0x28,0x29,0x40,0x2B,0x2D,0x2F],
                        (f, m) <- [(W.greedyView, 0), (W.shift, shiftMask)]]

WARNING: this is a config for a bépoè keyboard. You can tell from the line 0x22,0xAB,0xBB,0x28,0x29,0x40,0x2B,0x2D,0x2F, which defines the workspace mapping (these are the hex codes of the “digit” row, that is, the characters " « » ( ) @ + - /).
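To adapt the mapping to another layout, it helps to see which character each code stands for. In this range the X keysym values coincide with the Latin-1 character codes, so a couple of lines of Python 3 are enough (keysym_characters is a name made up for this example):

```python
# Hex codes copied from the workspace mapping in xmonad.hs.
KEY_CODES = [0x22, 0xAB, 0xBB, 0x28, 0x29, 0x40, 0x2B, 0x2D, 0x2F]


def keysym_characters(codes):
    """Return the character for each code (valid for Latin-1 keysyms)."""
    return [chr(code) for code in codes]


print(" ".join(keysym_characters(KEY_CODES)))
# → " « » ( ) @ + - /
```

Going the other way, hex(ord("«")) gives 0xab, which is how a list for a different layout could be built.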

The .xmobarrc, fairly minimal:

    Config { font = "-misc-fixed-*-*-*-*-15-*-*-*-*-*-*-*"
           , bgColor = "black"
           , fgColor = "grey"
           , position = TopW L 90
           , lowerOnStart = True
           , commands = [ Run Date "%a %b %_d %Y %H:%M:%S" "date" 10
                        , Run StdinReader
                        ]
           , sepChar = "%"
           , alignSep = "}{"
           , template = "%StdinReader% }{ <fc=#ee9a00>%date%</fc>"
           }

I tinkered a bit with /usr/share/xsessions/xmonad.desktop so that it calls an xmonad.script rather than xmonad directly. The xmonad.script script looks like this:

    trayer --edge top --align right --SetDockType true --SetPartialStrut true \
      --expand true --width 10 --transparent true --tint 0x000000 --height 19 &
    xsetbg -center -border black /home/isa/screen.jpg
    exec xmonad

(you’ll notice that I now even have a wallpaper, absolute luxury!). Well, as for explanations, the config in question is very largely inspired by John Goerzen’s, documented here.

There you goooooooo!

Everything comes eventually, even keyboards.

We’ve been waiting for it long enough: the Typematrix 2030 (and its bépo skin) has hit the shelves! Here are a few photos which, I’m told, are worth more than a long speech.

The beast in QWERTY:

The same with its bépoè skin/sock:

And the sock half on:

Well, as far as feel goes, it clearly takes some adapting: right now I still fumble quite a bit on the bottom-left row of the keyboard, which seems fairly normal to me given how it’s placed on a “staggered” keyboard. Also, I was rather used to typing k with my right hand, but hey, apparently that’s bad™.

Anyway, overall, I think we’ll see with use. But for now the impression isn’t bad: the touch and feel are rather pleasant, including with the sock.

A very ugly mini patch for Freevo

My occupation this Sunday afternoon was tinkering with a patch for Freevo so that it behaves in a somewhat more practical way.

Suppose I have directories video1/ and video2/, containing video files that I want to watch in the order of their directories, but possibly switching from directory video1/ to directory video2/ and vice versa.

Freevo doesn’t handle that very well: in all cases it goes back to the beginning of the directory. So I hacked together an extremely ugly patch, probably not Pythonic at all, and which probably makes dubious assumptions, to handle that.

It was developed on Freevo version 1.8.3; I don’t really intend to maintain it regularly, but just in case, here it is.

Freevo patch

Oh, and it only handles directories that contain nothing but videos (removing the corresponding code is all it takes to change that), and it assumes that MPlayer is what’s used for videos. In short, it’s probably a patch that’s only useful to me :p
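The behavior being aimed for can be pictured with a minimal Python sketch. This is neither the patch nor Freevo’s API: DirectoryPositions is a name made up for this example, and it only shows the idea of remembering, per directory, which file comes next.

```python
class DirectoryPositions:
    """Remember, for each directory, the index of the next file to play."""

    def __init__(self):
        self._next_index = {}

    def next_file(self, directory, files):
        """Return the next unplayed file of `directory`, or None when all were played."""
        index = self._next_index.get(directory, 0)
        if index >= len(files):
            return None
        # Advance only this directory's position; the others are untouched.
        self._next_index[directory] = index + 1
        return files[index]
```

Switching from video1/ to video2/ and back then resumes video1/ where it was left, instead of going back to its first file.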