WEX

The Freebase Wikipedia Extraction (WEX) is a processed dump of the English-language Wikipedia. The wiki markup for each article is transformed into machine-readable XML, and common relational features such as templates, infoboxes, categories, article sections, and redirects are extracted in tabular form. Freebase WEX is provided as a set of database tables in TSV format for PostgreSQL, along with tables that map Wikipedia articles to Freebase topics and to their corresponding Freebase Types.
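As an illustration of working with such tables outside a database, here is a minimal Python sketch that reads tab-separated rows. It assumes the files follow PostgreSQL's COPY text conventions (tab-delimited fields, `\N` for NULL, backslash escapes), which this page does not confirm. The function names `unescape_field` and `read_rows` and the sample rows are hypothetical, and octal and hex escapes are not handled.

```python
import io

# Single-character backslash escapes from PostgreSQL's COPY text format.
# Assumption: the WEX TSV files use this format; check the documentation.
_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r",
            "t": "\t", "v": "\v", "\\": "\\"}


def unescape_field(field):
    """Decode one raw field; the marker \\N stands for NULL (None)."""
    if field == r"\N":
        return None
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            # Unknown escapes fall back to the escaped character itself.
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def read_rows(stream):
    """Yield each line of a text stream as a list of decoded fields."""
    for line in stream:
        yield [unescape_field(f) for f in line.rstrip("\n").split("\t")]


# Made-up sample rows; the real column layout of each table is described
# in the WEX documentation, not here.
sample = io.StringIO("12\tAnarchism\t\\N\n25\tline one\\nline two\tx\n")
rows = list(read_rows(sample))
```

Reading line by line keeps memory use flat, which matters for tables whose uncompressed size runs to tens of gigabytes.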

Download

Metaweb Technologies provides Freebase WEX free of charge for any purpose and updates it regularly. It is distributed, like Wikipedia itself, under the terms of version 1.2 of the GNU Free Documentation License or any later version published by the Free Software Foundation.

Latest update: December 25, 2009
File               Download size   Uncompressed size
All files          12G             81G
Setup files        12K             44K
articles           5.6G            39G
sections           4.6G            39G
template_calls     208M            1.6G
template_values    997M            7.7G
category_members   89M             467M
redirects          57M             188M
freebase_names     64M             323M
freebase_wpid      39M             254M
freebase_types     300M            300M

Note that due to the large size of Freebase WEX, each data file within the All files tar archive is compressed individually, so that an individual table may be more easily extracted on systems with limited disk space.
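A minimal Python sketch of pulling one table out of the archive without unpacking the rest might look like the following. The function name `extract_member`, the archive name, and the member name in the usage comment are hypothetical; the actual file names and per-file compression format are not stated on this page.

```python
import os
import shutil
import tarfile


def extract_member(archive_path, wanted_name, dest_dir):
    """Copy a single file out of a tar archive, leaving it as stored.

    Returns the path of the extracted file, or None if no regular file
    with that base name is found in the archive.
    """
    with tarfile.open(archive_path, "r") as tar:
        # Iterating the archive reads member headers one at a time
        # instead of building a full listing first.
        for member in tar:
            if member.isfile() and os.path.basename(member.name) == wanted_name:
                dest_path = os.path.join(dest_dir, wanted_name)
                src = tar.extractfile(member)
                with src, open(dest_path, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                return dest_path
    return None


# Hypothetical usage; substitute the real archive and member names:
# extract_member("wex-all.tar", "redirects.tsv.bz2", "/data/wex")
```

The extracted file stays in its compressed form, so only the table you need has to be decompressed on disk.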

Documentation

See here for complete documentation.

Contact

Questions and comments about Freebase WEX should be directed to the Freebase Developer Email List.

Citing

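If you'd like to cite WEX in a publication, you may use:

Metaweb Technologies, Freebase Wikipedia Extraction (WEX), http://download.freebase.com/wex/, December 25, 2009.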

Or as BibTeX:

@misc{metaweb:wex,
  title = "Freebase Wikipedia Extraction (WEX)",
  author = "Metaweb Technologies",
  howpublished = "\url{http://download.freebase.com/wex/}",
  edition = "December 25, 2009",
  year = "2009"
}

  • DBpedia, "a community effort to extract structured information from Wikipedia and to make this information available on the Web", http://dbpedia.org
  • Hugo Zaragoza, Jordi Atserias, Massimiliano Ciaramita and Giuseppe Attardi (Yahoo! Research Barcelona), Semantically Annotated Snapshot of the English Wikipedia, http://www.yr-bcn.es/semanticWikipedia, 2007.
