Carnegie Mellon University
Language Technologies Institute (GHC 5719)
5000 Forbes Ave
Pittsburgh, PA 15213

Advisor: Noah Smith
Research group: Noah's Ark

Contact
email | +1 (217) 703-4454

Publications (BibTeX)

A Probabilistic Model for Canonicalizing Named Entity Mentions [PDF]
Dani Yogatama, Yanchuan Sim, Noah A. Smith
Annual Meeting of the Association for Computational Linguistics (ACL 2012). Jeju, Korea. Jul, 2012.

Discovering Factions in the Computational Linguistics Community [PDF]
Yanchuan Sim, Noah A. Smith, David A. Smith
Annual Meeting of the Association for Computational Linguistics (ACL 2012) Rediscovering 50 Years of Discoveries Workshop. Jeju, Korea. Jul, 2012.

Wei Zhang, Yanchuan Sim, Jian Su, Chew Lim Tan
International Joint Conferences on Artificial Intelligence (IJCAI 2011). Barcelona, Spain. Jul, 2011.

Wei Zhang, Yanchuan Sim, Jian Su, Chew Lim Tan
Text Analysis Conference (TAC 2010). Gaithersburg, MD, USA. Nov, 2010.

Projects

Yanchuan Sim (2010)
CS 598 Advanced NLP Final Project: Learning a Factorial HMM for Joint Sequence Labeling [PDF] [Poster]
Yanchuan Sim (2010)

Yanchuan Sim (2008)
2008 Summer attachment at Institute for Infocomms Research

Code

An assortment of tools and code that I use for my projects.

Ark-SAGE is a Java library that implements the L1-regularized version of Sparse Additive GenerativE models of Text (Eisenstein et al., 2011). SAGE is an algorithm for learning sparse representations of text (you can read more about it here).

A growing collection of handy utility modules for NLP with Python (mainly data processing related).
