Content from 2017-06

(move 'picocms 'jekyll)

posted on 2017-06-04

It occurred to me: if I die, who will pay for my hosting? So today I migrated my main website and blog posts from PicoCMS hosted at Scaleway to Jekyll hosted at GitHub. Even if I cannot pay for the domain name, my site can still be accessed at veer66.github.io

PicoCMS and Jekyll are both based on Markdown, so I just wrote the shell script below to rename my blog post files and modify some metadata:

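```sh
# Rename each post to <date>-<title>.md, using the metadata in its first 4 lines.
for x in *.md
do
    T=`head -n4 "$x" | grep '^Title:' | sed 's/Title: //' | sed 's/[ "\|?\/\(\)]/-/g'`
    D=`head -n4 "$x" | grep '^Date' | sed 's/Date: //' | sed 's/\//-/g'`
    mv "$x" "$D-$T.md"
done

# Strip the /* and */ markers around the metadata and turn the title into a heading.
for x in *.md
do
    cat "$x" | sed 's/\/\*//' | sed 's/\*\///' | sed 's/Title: /# /' > t && mv t "$x"
done
```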
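To see what the rename step produces without touching real posts, here is a small dry-run sketch. The sample file, its contents, and the `new_name` helper are made up for illustration; it assumes the metadata sits in a `/* ... */` block within the first four lines, which is what the script above appears to expect.

```sh
# Dry run in a throwaway directory; the sample post is hypothetical.
tmp=$(mktemp -d)
cat > "$tmp/sample.md" <<'EOF'
/*
Title: Hello world (again)?
Date: 2017/06/04
*/
Body text.
EOF

# new_name is an illustrative helper: it prints the target file name
# built with the same head/grep/sed pipeline, without renaming anything.
new_name() {
    T=$(head -n4 "$1" | grep '^Title:' | sed 's/Title: //' | sed 's/[ "\|?\/\(\)]/-/g')
    D=$(head -n4 "$1" | grep '^Date' | sed 's/Date: //' | sed 's/\//-/g')
    echo "$D-$T.md"
}

new_name "$tmp/sample.md"
# prints 2017-06-04-Hello-world--again--.md
```

Spaces, quotes, and punctuation in the title all become hyphens, so the result fits the date-prefixed file names that Jekyll uses for posts.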

If I outlive GitHub, I can just generate this site again and host it somewhere else.

A benchmark of Thai word tokenizers written in various programming languages

posted on 2017-06-04

The original post was at https://veer66.wordpress.com/2017/01/19/benchmark-thai-word-tokenizers/, posted on 2017/01/19.

I wonder about the speed of programs written in different languages. For example, I wonder whether one written in Kotlin and run on the JVM is slower than one written in Go. Although there are several existing benchmarks, this one may still be important, at least for me, because Thai word tokenization is my real task.

So @iporsut and I wrote some programs in different programming languages and tried to optimize them.

I conducted the experiment on my laptop computer, which has an Intel® Core™ i3-4030U CPU @ 1.90GHz × 4, on a 20MB Thai text corpus.

  • Rust #1: 3.366 #2: 3.247 #3: 3.241 #Avg: 3.284
  • Go #1: 5.415 #2: 5.405 #3: 5.416 #Avg: 5.412
  • Crystal #1: 5.637 #2: 5.679 #3: 5.649 #Avg: 5.655
  • Kotlin+Clojure #1: 6.547 #2: 6.743 #3: 6.628 #Avg: 6.639
  • Julia #1: 38.316 #2: 38.112 #3: 38.237 #Avg: 38.221
  • JavaScript #1: 49.349 #2: 49.084 #3: 49.901 #Avg: 49.445
  • Python #1: 50.624 #2: 50.803 #3: 50.869 #Avg: 50.765
  • Clojure+Kotlin #1: 63.502 #2: 67.561 #3: 67.303 #Avg: 66.122
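For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a three-run timing harness. It is not the harness used for the numbers above; the post does not say how the runs were timed. The `time_once` and `bench3` helpers are hypothetical, and the sketch assumes GNU `date` (for `%N` nanoseconds) and `awk`.

```sh
# time_once: run a command once and print its wall-clock time in seconds.
# Assumes GNU date, whose %N gives nanoseconds.
time_once() {
    start=$(date +%s.%N)
    "$@" > /dev/null
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# bench3: time a command three times, then print each run and the average
# in the same "#1: ... #Avg: ..." style as the list above.
bench3() {
    total=0
    for i in 1 2 3
    do
        t=$(time_once "$@")
        echo "#$i: $t"
        total=$(awk -v a="$total" -v b="$t" 'BEGIN { printf "%.3f", a + b }')
    done
    awk -v sum="$total" 'BEGIN { printf "#Avg: %.3f\n", sum / 3 }'
}

# Usage (the program and file names are placeholders):
# bench3 ./tokenizer corpus.txt
```

Discarding the program's output keeps terminal speed out of the measurement, but it still measures process start-up, which matters for runtimes with slow start-up.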

Additional setup

Future work

@iporsut has already written multicore versions, so maybe next month I will conduct another experiment.



Unless otherwise credited all material Creative Commons License by Vee Satayamas