WEX
The Freebase Wikipedia Extraction (WEX) is a processed dump of the English-language Wikipedia. The wiki markup of each article is transformed into machine-readable XML, and common relational features such as templates, infoboxes, categories, article sections, and redirects are extracted in tabular form. Freebase WEX is provided as a set of database tables in TSV format for PostgreSQL, along with tables that map Wikipedia articles to Freebase topics and to their corresponding Freebase Types.
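As a rough illustration of what "TSV format for PostgreSQL" can mean for a consumer working outside the database, the sketch below parses tab-separated lines. It assumes the files follow PostgreSQL's COPY text conventions (tab delimiter, `\N` for NULL, backslash escapes), which this page does not confirm. The helper names and the two-column sample are hypothetical, and the real WEX column layout is not shown here.

```python
import io

# Backslash escapes from PostgreSQL's COPY text format (a common subset).
_ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}


def decode_field(raw):
    """Decode one field; the two-character sequence \\N stands for NULL."""
    if raw == r"\N":
        return None
    out = []
    chars = iter(raw)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # Unknown escapes fall back to the escaped character itself.
            out.append(_ESCAPES.get(nxt, nxt))
        else:
            out.append(ch)
    return "".join(out)


def read_rows(stream):
    """Yield each line of a COPY-style TSV stream as a list of fields."""
    for line in stream:
        yield [decode_field(f) for f in line.rstrip("\n").split("\t")]


# Hypothetical two-column sample, not actual WEX data.
sample = io.StringIO("12\tAnarchism\n25\t\\N\n39\tline one\\nline two\n")
rows = list(read_rows(sample))
```

Reading line by line keeps memory use flat, which matters for tables whose uncompressed size runs to tens of gigabytes.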
Download
Metaweb Technologies provides Freebase WEX free of charge, for any purpose, and updates it regularly. Like Wikipedia itself, it is distributed under the terms of version 1.2 of the GNU Free Documentation License or any later version published by the Free Software Foundation.
| File | Download size | Uncompressed size |
|---|---|---|
| All files † | 12G | 81G |
| Setup files | 12K | 44K |
| articles | 5.6G | 39G |
| sections | 4.6G | 39G |
| template_calls | 208M | 1.6G |
| template_values | 997M | 7.7G |
| category_members | 89M | 467M |
| redirects | 57M | 188M |
| freebase_names | 64M | 323M |
| freebase_wpid | 39M | 254M |
| freebase_types | 300M | 300M |
† Note that due to the large size of Freebase WEX, each data file within the All files tar archive is compressed individually, so that an individual table may be more easily extracted on systems with limited disk space.
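Because each data file sits in the archive as its own compressed member, a single table can be pulled out without unpacking everything. The following is a minimal sketch using Python's standard `tarfile` module. The function name `extract_table` is hypothetical, and it assumes that member file names begin with the table name; the actual member names and compression suffix are not stated on this page.

```python
import os
import shutil
import tarfile


def extract_table(archive_path, table_name, dest_dir):
    """Copy the first member whose file name starts with table_name.

    Returns the path of the written file, or None if nothing matched.
    The member is copied as-is, so it stays individually compressed.
    """
    with tarfile.open(archive_path) as archive:  # default mode detects compression
        for member in archive:  # members are read in archive order
            base = os.path.basename(member.name)
            if member.isfile() and base.startswith(table_name):
                target = os.path.join(dest_dir, base)
                with archive.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                return target
    return None
```

A call such as `extract_table("all.tar", "redirects", "/data/wex")` would write only the redirects file to the destination directory; both paths here are placeholders. The copied file still needs to be decompressed with whatever tool matches its suffix.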
Documentation
See here for complete documentation.
Contact
Questions and comments about Freebase WEX should be directed to the Freebase Developer Email List.
Citing
If you'd like to cite WEX in a publication, you may use:
- Metaweb Technologies, Freebase Wikipedia Extraction (WEX), http://download.freebase.com/wex/, December 25, 2009
Or as BibTeX:
@misc{metaweb:wex,
title = "Freebase Wikipedia Extraction (WEX)",
author = "Metaweb Technologies",
howpublished = "\url{http://download.freebase.com/wex/}",
edition = "December 25, 2009",
year = "2009"
}
Related Work
- DBpedia, "a community effort to extract structured information from Wikipedia and to make this information available on the Web", http://dbpedia.org
- Hugo Zaragoza, Jordi Atserias, Massimiliano Ciaramita and Giuseppe Attardi (Yahoo! Research Barcelona), Semantically Annotated Snapshot of the English Wikipedia, http://www.yr-bcn.es/semanticWikipedia, 2007.