Discovering Finance Keywords via Continuous Space Language Models

Ming-Feng Tsai (1), Chuan-Ju Wang (2), and Po-Chuan Chien (1)
(1) Department of Computer Science, National Chengchi University, Taipei 116, Taiwan
(2) Research Center for Information Technology Innovation, Academia Sinica, Taipei 115, Taiwan


As the amount of public financial data grows, it becomes increasingly important to learn how to discover information that is valuable for financial decision-making. This paper proposes an approach to discovering financial keywords from a large number of financial reports. In particular, we apply the continuous bag-of-words (CBOW) model, a well-known continuous space language model, to the textual information of 10-K financial reports to discover new finance keywords. To capture word meanings more precisely and thereby better locate financial terms, we also present a novel technique that incorporates syntactic information into the CBOW model. Experimental results on two prediction tasks that use the discovered keywords, stock volatility prediction and abnormal trading volume prediction, demonstrate that our approach is effective at discovering keywords with predictive power. Furthermore, we provide analyses of the discovered keywords, which attest to the ability of the proposed method to capture both syntactic and contextual information between words; this shows that the method can be applied successfully to the field of finance.

Tag group   Part-of-speech tags in the group
JJ          JJ (adjective), JJR (adjective, comparative), JJS (adjective, superlative)
NN          NN (noun, singular or mass), NNS (noun, plural), NNP (proper noun, singular), NNPS (proper noun, plural)
PRP         PRP (personal pronoun), PRP$ (possessive pronoun)
RB          RB (adverb), RBR (adverb, comparative), RBS (adverb, superlative)
VB          VB (verb, base form), VBD (verb, past tense), VBG (verb, gerund or present participle), VBN (verb, past participle), VBP (verb, non-3rd person singular present), VBZ (verb, 3rd person singular present)
WP          WP (wh-pronoun), WP$ (possessive wh-pronoun)
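
To illustrate how the part-of-speech tag groups listed above could be used, the following is a minimal Python sketch that collapses fine-grained tags into their groups and attaches the group to each word. This section does not spell out how the syntactic information enters the CBOW model, so the `word_TAG` token format, the lowercasing, and the names `COARSE_TAG` and `annotate` are assumptions made purely for illustration, not the authors' implementation.

```python
# Mapping from fine-grained tags to the six groups in the table above.
# Tags that do not appear in the table (e.g. IN, CD) have no group.
COARSE_TAG = {
    "JJ": "JJ", "JJR": "JJ", "JJS": "JJ",
    "NN": "NN", "NNS": "NN", "NNP": "NN", "NNPS": "NN",
    "PRP": "PRP", "PRP$": "PRP",
    "RB": "RB", "RBR": "RB", "RBS": "RB",
    "VB": "VB", "VBD": "VB", "VBG": "VB", "VBN": "VB", "VBP": "VB", "VBZ": "VB",
    "WP": "WP", "WP$": "WP",
}


def annotate(tagged_tokens, sep="_"):
    """Turn (word, tag) pairs into tokens carrying their tag group.

    Assumption: a word and its group are joined as "word_GROUP";
    words whose tag is outside the table are kept without a suffix.
    """
    tokens = []
    for word, tag in tagged_tokens:
        group = COARSE_TAG.get(tag)
        word = word.lower()
        tokens.append(f"{word}{sep}{group}" if group else word)
    return tokens


# Hypothetical, hand-tagged example sentence.
sentence = [("Revenues", "NNS"), ("decreased", "VBD"),
            ("significantly", "RB"), ("in", "IN"), ("2006", "CD")]
print(annotate(sentence))
# ['revenues_NN', 'decreased_VB', 'significantly_RB', 'in', '2006']
```

Under this assumed scheme, token sequences of this form could be passed to any CBOW trainer in place of the raw words, so that the same surface form used as a noun and as a verb receives two separate vectors.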