I am a PhD student on the INCEpTION project, where I work on interactive annotation for NLP. Currently I research how several annotation tasks, e.g. creating corpora with linked entities or predicate-argument structures, can be supported by interactive machine learning. In my free time, I love to learn Mandarin.
PhD in Computer Science, 2018 - now
MSc in Computer Science, 2017
BSc in Computer Science, 2014
After finishing Probability and Statistics by Stanford Online, I wanted to improve my knowledge of statistics further. The logical choice was to take Probability - The Science of Uncertainty and Data by MITx (6.431x), which was starting at the time. While there is an MIT OCW version of the course, I decided to use the edX version: the videos are shorter, it is more structured, it has tight deadlines instead of self-study, and, as the course was just starting, many other students were participating at the same time.
For seqviz and the INCEpTION external recommender, I recently had the problem that I needed to use BERT for sequence tagging but had to predict labels for my own tokens. The problem is that transformers typically use subword tokenization like WordPiece or SentencePiece. That means one of my tokens potentially corresponds to more than one transformer token. The trick to align the two tokenizations is the following: one does not ask the transformer tokenizer to tokenize the whole sentence, but lets the tokenizer tokenize each of one's own tokens individually.
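The per-token trick can be sketched roughly as follows. This is a minimal illustration, not the code used in seqviz or the INCEpTION recommender: `toy_wordpiece`, `align` and `first_piece_labels` are hypothetical names, and the toy tokenizer only mimics WordPiece-style output by cutting tokens into three-character pieces. With a real transformer tokenizer one would presumably pass its per-string tokenize function instead, and would also have to account for special tokens such as [CLS] and [SEP] shifting the offsets.

```python
from typing import Callable, List, Optional, Tuple


def toy_wordpiece(token: str, max_len: int = 3) -> List[str]:
    """Hypothetical stand-in for a subword tokenizer.

    Cuts a token into pieces of at most max_len characters and marks
    continuation pieces with '##', similar in spirit to WordPiece.
    """
    pieces = [token[i:i + max_len] for i in range(0, len(token), max_len)]
    return [p if i == 0 else "##" + p for i, p in enumerate(pieces)]


def align(
    tokens: List[str],
    subword_tokenize: Callable[[str], List[str]],
) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Tokenize each of our own tokens separately and record its span.

    Returns the flat list of subword pieces plus, for every original
    token, the half-open range [start, end) of its pieces in that list.
    """
    subwords: List[str] = []
    spans: List[Tuple[int, int]] = []
    for token in tokens:
        start = len(subwords)
        subwords.extend(subword_tokenize(token))
        spans.append((start, len(subwords)))
    return subwords, spans


def first_piece_labels(
    piece_labels: List[str],
    spans: List[Tuple[int, int]],
    default: Optional[str] = None,
) -> List[Optional[str]]:
    """Map per-piece predictions back to our tokens.

    Uses the label of the first piece of each token (one common choice);
    tokens that produced no pieces get the default label.
    """
    return [piece_labels[s] if e > s else default for s, e in spans]


tokens = ["Obama", "visited", "Berlin"]
subwords, spans = align(tokens, toy_wordpiece)

# Pretend these are the model's predictions, one per subword piece.
piece_labels = ["PER", "PER", "O", "O", "O", "LOC", "LOC"]
token_labels = first_piece_labels(piece_labels, spans)
```

Because the spans are recorded while tokenizing, no fuzzy string matching between the two tokenizations is needed afterwards; taking the first piece's label is only one possible aggregation strategy.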
I recently got interested in economics and finance. A post on Hacker News recommended taking Economics of Money and Banking on Coursera as an introduction. I really enjoyed the course and will give a brief review of it here. Content From the course description: The last three or four decades have seen a remarkable evolution in the institutions that comprise the modern monetary system. The financial crisis of 2007-2009 is a wakeup call that we need a similar evolution in the analytical apparatus and theories that we use to understand that system.
In order to brush up my math skills, I started the Probability and Statistics course by Stanford Online. I found it via the Free online machine learning curriculum by Chip Huyen. If you do not know her, make sure to read her blog and books! Sadly, the online platform that hosted the course will shut down. Hopefully, the course will be hosted somewhere else, e.g. on Open edX. I will update this post once I know more!
Introduction In October, I participated together with a friend in the 2019 International Collegiate Competition for Brain-inspired Computing organized by Tsinghua University. The task was to develop an AI system that is brain-inspired, that is, inspired by how the human brain works. From the website: In this contest, we generalize brain-inspired computing a little bit to a wider range of works reflecting the integration of neuroscience and computer science on
INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management.
seqviz is a Python package to visualize sequence tagging results. It can be used either to print to the console or in Jupyter notebooks.