Dmitry Shpika
// Hacking HTML since 2005

About me

I code for a living and as a hobby, mostly web applications and libraries in TypeScript. Since my teenage years, I have tried programming languages from Assembly to Coq, and have written and read tons of code, in projects ranging from hello-worlds to cloud-based apps with millions of users. My other interests include natural languages, science, sci-fi, cyber-security, and art (I can draw some fancy pixel-art).

You can find some of my pet projects on GitHub:

  • jmdict-simplified is a converted version of JMdict, JMnedict, Kanjidic, and Kradfile/Radkfile in a user-friendly JSON format (see documentation), created out of frustration after struggling with the original XML files.
  • Kanji Frequency is a unique dataset of usage data for Japanese kanji characters, maintained since 2015. Originally created for my Japanese dream app (which never happened), it includes data from multiple corpora, covering most styles of the Japanese language (e.g. academic writing or fiction books).
  • is-han is a small Node.js library for properly detecting Han characters (Kanji, Hanzi, Hanja). Because, apparently, it's not as simple as just using a regex with two code points.
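
To illustrate why a two-code-point regex falls short, here is a minimal sketch comparing a naive range check with a Unicode script check. The function names `isHanNaive` and `isHanByScript` are hypothetical and written for this example only; this is not the is-han API, and the library may well use a different approach.

```typescript
// Naive check: only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF).
// Without the "u" flag the regex works on UTF-16 code units, so characters
// outside the Basic Multilingual Plane can never match.
const naiveHan = /^[\u4E00-\u9FFF]$/;

// Unicode-aware check: any single code point whose Script property is Han.
// Requires a runtime with Unicode property escapes (ES2018 or later).
const scriptHan = new RegExp("^\\p{Script=Han}$", "u");

// Hypothetical helper: true only for characters in the basic block.
function isHanNaive(ch: string): boolean {
  return naiveHan.test(ch);
}

// Hypothetical helper: true for any single Han-script code point.
function isHanByScript(ch: string): boolean {
  return scriptHan.test(ch);
}

// One character from the basic block, one from Extension A (U+3400),
// one from Extension B (U+20000), and one hiragana for contrast.
const samples = ["漢", "㐀", "𠀀", "あ"];

for (const ch of samples) {
  console.log(ch, "naive:", isHanNaive(ch), "script:", isHanByScript(ch));
}
```

The naive range misses the extension blocks, while the script-based check accepts them and still rejects kana.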

I have a degree in IT security, and even though I've never worked as a security engineer, SQL injections are things I know about and look for when I write or review code. I even did a small research project on timing attacks in Node.js.

You can also find me on Tatoeba.org, where I study languages and contribute sentences.

Posts