Reference Data

Easy-to-use reference data in CSV and JSON formats.

This page lists all reference data available on DataHub. Each dataset includes instructions on how to use it in different tools and programming languages.

What is reference data? Wikipedia defines it as follows:

Reference data are data that define the set of permissible values to be used by other data fields. Reference data gain in value when they are widely re-used and widely referenced. Typically, they do not change overly much in terms of definition, apart from occasional revisions. Reference data are often defined by standards organizations, such as country codes as defined in ISO 3166-1…
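As a rough illustration of what "permissible values" means in practice, the Python sketch below checks a data field against a small reference set. The three country codes, the sample records, and the function name `is_valid_country` are assumptions made for this example; they are not part of any DataHub dataset or API.

```python
# Hypothetical reference set: a few ISO 3166-1 alpha-2 codes for illustration.
# A real application would load the full list from a reference dataset.
PERMITTED_COUNTRY_CODES = {"DE", "FR", "US"}


def is_valid_country(code: str) -> bool:
    """Return True if the code is one of the permitted reference values."""
    return code.strip().upper() in PERMITTED_COUNTRY_CODES


# Hypothetical records whose "country" field should hold a permitted code.
records = [
    {"customer": "A", "country": "fr"},
    {"customer": "B", "country": "XX"},
]

# Collect the records whose country field is not a permitted value.
invalid = [r for r in records if not is_valid_country(r["country"])]
print(invalid)
```

Keeping the permitted values in one shared set means every field that references them is validated the same way, which is the main reason reference data gain value through re-use.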

In this section, we have included the datasets that are most popular among our users.

List of countries

ISO 3166-1-alpha-2 English country names and code elements. This list states the country names (official short names in English) in alphabetical order as given in ISO 3166-1 and the corresponding ISO 3166-1-alpha-2 code elements:

core/country-list
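A minimal sketch of how such a list might be used from Python, assuming the CSV has the columns `Name` and `Code` (an assumption about the file layout, so check the actual header). The sample rows are embedded inline so that the snippet runs without a download, and `load_country_names` is a hypothetical helper name.

```python
import csv
import io

# Sample rows in the assumed layout (columns "Name" and "Code").
SAMPLE_CSV = """Name,Code
Afghanistan,AF
Albania,AL
Algeria,DZ
"""


def load_country_names(csv_text: str) -> dict[str, str]:
    """Map each alpha-2 code to its English short name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Code"]: row["Name"] for row in reader}


names_by_code = load_country_names(SAMPLE_CSV)
print(names_by_code["DZ"])
```

With the real file, the same function could be given the file's contents instead of `SAMPLE_CSV`.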

S&P 500 companies

List of companies in the S&P 500 (Standard and Poor’s 500). The S&P 500 is a free-float, capitalization-weighted index of the top 500 publicly listed stocks in the US by market capitalization. The dataset includes a list of all the stocks contained therein:

core/s-and-p-500-companies

Language codes

Comprehensive language code information, consisting of ISO 639-1, ISO 639-2 and IETF language types:

core/language-codes

World cities

List of major cities in the world that have more than 15,000 inhabitants. Each city is associated with its country and subcountry to reduce ambiguity. The subcountry can be the name of a state (e.g. in the United Kingdom or the United States of America) or the major administrative section (e.g. "region" in France):

core/world-cities
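To show why the country and subcountry fields matter, here is a small Python sketch. The column names `name`, `country`, and `subcountry` and the three sample rows are assumptions for illustration and may not match the actual file.

```python
import csv
import io
from collections import defaultdict

# Sample rows in an assumed layout; the real column names may differ.
SAMPLE_CSV = """name,country,subcountry
Springfield,United States,Illinois
Springfield,United States,Missouri
Paris,France,Île-de-France
"""

cities = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Grouping by name alone shows which names are ambiguous.
by_name = defaultdict(list)
for city in cities:
    by_name[city["name"]].append(city)

# Keying by (name, country, subcountry) gives each sample row its own key.
by_full_key = {(c["name"], c["country"], c["subcountry"]): c for c in cities}

print(len(by_name["Springfield"]))
print(("Springfield", "United States", "Missouri") in by_full_key)
```

In this sample the city name alone matches two rows, while the three-part key identifies each row separately.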

Other useful reference data

Airport codes

Airport codes from around the world:

core/airport-codes

Country codes

Comprehensive country code information, including ISO 3166 codes, ITU dialing codes, ISO 4217 currency codes, and many others:

core/country-codes

Currency codes

List of currencies and their 3 digit codes as defined by ISO 4217. The data provided here is the consolidation of Table A.1 “Current currency & funds code list” and Table A.3 “Historic denominations”:

core/currency-codes

Continent codes

A list of the seven continents with English names and short, unique and permanent identifying codes:

core/continent-codes

SMDG Master Terminal Facilities List

List maintained by the SMDG Secretariat to specify the port terminal facilities in UN/EDIFACT messages. The list is directly linked with the UN/LOCODE codelist:

core/smdg-master-terminal-facilities-list

IMO IMDG Classification Codes

Official IMDG Codes for use in transport of dangerous goods as described by the IMO:

core/imo-imdg-codes

IPv4 geolocation

Database of IPv4 address networks with their respective geographical location:

core/geoip2-ipv4

Top Level Domain Names

This Data Package contains the delegation details of top-level domains:

core/top-level-domain-names

Classification of the Functions of Government

Classification of the Functions of Government (COFOG) is a classification defined by the United Nations Statistics Division:

core/cofog

UNECE/CEFACT package codes

Coded representations of the package type names used in international trade (UNECE/CEFACT Trade Facilitation Recommendation No. 21):

core/unece-package-codes

UNECE Units of measure

Standardised codes from Recommendation 20, maintained by UNECE:

core/unece-units-of-measure

UN Locode

The United Nations Code for Trade and Transport Locations is a code list maintained by UNECE, a United Nations agency, to facilitate trade:

core/un-locode

Membership to Copyright Treaties

This dataset details the membership of states in WIPO-administered treaties on the subject matter of copyright:

core/membership-to-copyright-treaties

List of FIPS

List of FIPS (Federal Information Processing Standards) region codes:

core/fips-10-4

Media Types

This dataset lists all the Media Types (MIME types), Media Subtypes, and their file extensions:

core/media-types

ISO 6346 Container Type Codes

Coded list of ISO 6346 shipping containers, used in international trade and electronic shipping messages:

core/iso-container-codes

International Chamber of Commerce Incoterms

International Commercial Terms (‘Incoterms’) are internationally recognised standard trade terms used in sales contracts:

core/icc-incoterms

DAC and CRS code lists

The DAC Secretariat maintains various code lists which are used by donors to report on their aid flows to the DAC databases. In addition, these codes are used to classify information in the DAC databases:

core/dac-and-crs-code-lists

UK Condensed SIC 2007

UK condensed standard industrial classification of economic activities (SIC) 2007 codes:

core/uk-sic-2007-condensed

S&P 500 Companies with Financial Information

List of companies in the S&P 500 (Standard and Poor’s 500). The dataset includes a list of all the stocks contained therein and associated key financials such as price, market capitalization, earnings, price/earnings ratio, and price-to-book ratio:

core/s-and-p-500-companies-financials

NYSE and Other Listings

List of companies listed on the NYSE and other exchanges:

core/nyse-other-listings

NASDAQ listings

List of companies listed on the NASDAQ exchanges:

core/nasdaq-listings