Filipino Guide to Language Technology
Book Concept
The target audience is the undergraduate IT student looking to do a school project related to human language. Start with a chapter on applying NLTK to Cebuano, including the Wolff dictionary. The secondary target is the language specialist who is also an IT power user.
Post draft chapters on Linguistic Exploration, or later on a dedicated website.
...
Outline
- Introduction: what human language technology (HLT) is, the diversity of Philippine languages, the language situation in the Philippines, the disciplines of informatics and linguistics
- Using HLT on the Web, identifying the gap for Philippine languages, focus on three problems motivating running examples
- Using NLTK
- Concepts about Language: Phonemes and Morphemes
- Concepts about Software: data representation, XML
- e.g. TEI dictionaries, Wolff dictionary
- Lang: Syntax at level of parse trees
- using NLTK for a grammar of Filipino
- Modeling syntax
- FSM, push-down automata, recursive descent parser
- formal grammars of Filipino and Cebuano
- Corpus building
- Using NLTK, building a database/Web repository
- Using semantics
- Semantic Web
- Word senses in a TEI dictionary
- Modifying NLTK
- Python or Java
- Typed Feature Structures
- HPSG models of Filipino and Cebuano
- Using LKB
- Semantics of Words
- FrameNet
- Modifying NLTK for word semantics of a Philippine language
- Semantics of Clauses
- MRS
- Discourse
- DRT
- Speech Acts
- Comparing Languages
- Extended example
- Cognitive Modeling
- Open projects
- Collaborating with language specialists
- Publishing on the Web
- Open Source
Biblio
- NLTK book
- SIL software
- HPSG
- Kim & Sells 2008
- Sag, Wasow and Bender
- LKB
- ....