WEX

The Freebase Wikipedia Extraction (WEX) is a processed dump of the English-language Wikipedia. The wiki markup of each article is converted to machine-readable XML, and common relational features such as templates, infoboxes, categories, article sections, and redirects are extracted in tabular form.

Freebase WEX is provided as a set of database tables in TSV format for PostgreSQL, along with tables that map Wikipedia articles to Freebase topics and to their corresponding Freebase Types.
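
As a rough illustration of reading one of these tables without a database, the sketch below parses rows in PostgreSQL's COPY text format (tab-separated fields, \N for NULL, backslash escapes). That the WEX files follow this format exactly is an assumption, and the helper names decode_field and read_rows are hypothetical. The column layout of each table is not described on this page, so rows are returned as plain lists of strings. Octal and hexadecimal escapes are not handled.

```python
import re

# Single-character escapes defined by PostgreSQL's COPY text format.
_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r",
            "t": "\t", "v": "\v", "\\": "\\"}
_ESCAPE_RE = re.compile(r"\\(.)", re.DOTALL)


def decode_field(raw):
    # A field consisting of exactly backslash-N represents SQL NULL.
    if raw == r"\N":
        return None
    # Any other backslash escapes the next character; unknown escapes
    # fall back to the character itself (octal/hex forms are NOT decoded).
    return _ESCAPE_RE.sub(lambda m: _ESCAPES.get(m.group(1), m.group(1)), raw)


def read_rows(lines):
    # Literal tabs and newlines inside values are escaped in this format,
    # so splitting each physical line on tabs is safe.
    for line in lines:
        yield [decode_field(field) for field in line.rstrip("\n").split("\t")]
```

Because read_rows is a generator, a multi-gigabyte table can be scanned one row at a time, for example by passing it a file opened with open(path, encoding="utf-8", newline="\n"); the UTF-8 encoding is likewise an assumption.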

Download

Metaweb Technologies provides Freebase WEX free of charge for any purpose and updates it regularly. Like Wikipedia itself, it is distributed under the terms of the GNU Free Documentation License, version 1.2 or any later version published by the Free Software Foundation.

Latest update: June 23, 2008

File               Download size    Uncompressed size
All files          7.8 GB           55 GB
Setup files        12 KB            44 KB
articles           3.9 GB           27 GB
sections           3.1 GB           23 GB
template_calls     111 MB           875 MB
template_values    496 MB           4 GB
category_members   50 MB            312 MB
redirects          45 MB            135 MB
freebase_names     46 MB            247 MB
freebase_wpid      28 MB            195 MB
freebase_types     8.3 MB           128 MB

Note that due to the large size of Freebase WEX, each data file within the All files tar archive is compressed individually, so that an individual table may be more easily extracted on systems with limited disk space.
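
Pulling a single table out of the archive could be sketched as follows. The member names and their compression suffix are assumptions (this page does not say which compressor is used), so the hypothetical extract_table function picks a decompressor from the file extension and falls back to copying the bytes unchanged.

```python
import bz2
import gzip
import lzma
import os
import shutil
import tarfile

# Filename suffix -> function that wraps a binary stream in a decompressor.
# Which suffix the WEX files actually use is an assumption.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def extract_table(archive_path, member_name, out_path, chunk_size=1 << 20):
    """Stream one member of a tar archive to out_path, decompressing it."""
    with tarfile.open(archive_path) as archive:
        raw = archive.extractfile(member_name)  # KeyError if the name is absent
        if raw is None:
            raise ValueError(member_name + " is not a regular file")
        opener = OPENERS.get(os.path.splitext(member_name)[1])
        stream = opener(raw) if opener else raw
        # Copy in fixed-size chunks so only one table's uncompressed data
        # reaches the disk and memory use stays bounded.
        with stream, open(out_path, "wb") as out:
            shutil.copyfileobj(stream, out, chunk_size)
```

A call such as extract_table("all.tar", "redirects.tsv.gz", "redirects.tsv") uses placeholder file names; the real member names can be listed with tarfile.open(path).getnames().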

Documentation

See here for complete documentation.

Contact

Questions and comments about Freebase WEX should be directed to the Freebase Developer Email List.

Citing

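If you'd like to cite WEX in a publication, you may use:

Metaweb Technologies, Freebase Wikipedia Extraction (WEX), http://download.freebase.com/wex/, June 23, 2008.

Or as BibTeX: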

@misc{metaweb:wex,
  title = "Freebase Wikipedia Extraction (WEX)",
  author = "Metaweb Technologies",
  howpublished = "\url{http://download.freebase.com/wex/}",
  edition = "June 23, 2008",
  year = "2008"
}

See also

  • DBpedia, "a community effort to extract structured information from Wikipedia and to make this information available on the Web", http://dbpedia.org
  • Hugo Zaragoza, Jordi Atserias, Massimiliano Ciaramita and Giuseppe Attardi (Yahoo! Research Barcelona), Semantically Annotated Snapshot of the English Wikipedia, http://www.yr-bcn.es/semanticWikipedia, 2007.