
Blog Posts

Generate an interactive webpage from CSV data and markdown

Rising Odegua

Getting Started

This tutorial will help you get started with Datahub pages and show how you can use them to create data-driven webpages from static data.

Data pages you will create ca...


A Short Case Study Involving Table Schema Frictionless Specs at the European Union

sebastien.lavoie

The Frictionless specifications are helping to simplify data validation for applications in production at the European Union. More specifically, ...

COVID-19 and Compartmental Models in Epidemiology

rufuspollock

The severity of the current SARS-CoV-2 epidemic is undeniable: since the last months of 2019, the COVID-19 outbreak has had a significant impact on the world at the macro level, starting its sp...


Open Data Day 2020 and COVID-19 data

michael.polidori

Here at DataHub and Datopian, we recently celebrated Open Data Day 2020. If you're not familiar with Open Data Day, it...


Comparotron: A simple way to visualize and share comparisons

rufuspollock

Comparotron allows users to quickly create simple comparative visualizations.

There are already many graphing tools and apps out there, so what's different about Comparotron?

The essence of ...


New Machine Learning Datasets

branko-dj

We are happy to present new datasets extracted from the open-ml website. You can find them at our machine-learning datasets ...

Automatically updated core datasets on DataHub

anuveyatsu

Check out a list of core datasets that are updated on a regular basis. From financial to reference data, it is the best place to find a wide range of up-to-date datasets.

Financial data


Sports data on DataHub

anuveyatsu

Great news! We've expanded our range of datasets to include sports data. You can find football data that includes all the major European leagues and ATP tennis data. The football datasets are updat...


Attribute Relation File Format (ARFF)

branko-dj

We are happy to present a short description of the ARFF format, which is very useful for those interested in machine learning. In this post we explain some features of this format.

What is ...


How to use multiple DataHub accounts

anuveyatsu

If you are using the data CLI tool for both personal and professional purposes, you will need more than one account. Below we explain how account configurations work and how you can...


World Bank Indicators on DataHub

branko-dj

We have extracted 307 indicators from The World Bank and published them on DataHub:

https://datahub.io/world-bank

The World Bank Open Data website offers free access to comprehensive,...


Automated KPIs collection and visualization of the funnels

branko-dj

As a platform dedicated to providing access to high-quality data and tooling, we need to measure how useful our users find DataHub's services. Measurable values like how many users we have, site tra...


Revamped awesome collections: data sets that are grouped by subject

anuveyatsu

Awesome pages are collections of data from DataHub and the web that are grouped and analyzed for your use. Our goal is to cover all important subjects, and users can always s...


Machine learning datasets

svetozarstojkovic, branko-dj

We have created a number of machine learning datasets that may interest professionals and students in the field.

You can see our current machine-learning datasets at...


Auto-publish your datasets using Travis-CI

anuveyatsu

In this tutorial, we provide instructions on how to automate publishing your dataset via Travis-CI. If you prefer hosting and controlling your dataset on GitHub...


JavaScript SDK for data deployment

acckiygerman, anuveyatsu

Here we explain how you can use the JavaScript SDK for data deployment. If you need a detailed step-by-step tutorial, please go to this article:

https://datahub.io/docs/tutorials/js-sd...


How to initialize a data package using data tool

anuveyatsu

In this article we explain how easy it is to add a datapackage.json file to your data. You need to have the data tool installed - download ...


Validate your Data Package descriptor online

acckiygerman

To help users create Data Packages, we have implemented a descriptor validation tool:

https://datahub.io/tools/validate

Now users can check the Data Package descriptor to be ...


Q1 2018 Review

anuveyatsu, rufuspollock, akariv

We're sharing an update on all the progress we made in the first quarter of 2018. We massively improved our data command line tool, sped up data deployment 5-100x and introduced embedd...


New Features and Improvements

acckiygerman

Good day, dear data miners, scientists and statisticians!

Over the last month we focused on polishing the existing product: the DataHub platform and the data-cli tool. A...


Improved Reporting and Debugging of Data Publishing

anuveyatsu

We've integrated our pipelines system with the website to display more insights to our users. Any dataset you publish on DataHub could be in one of three states: processing, ...


Data Validation in the DataHub

rufuspollock, anuveyatsu

Users can now use the DataHub to validate their tabular data, for example checking that dates really are dates or that a column of daily revenue is always positive.

Data validation is also i...


Which country spends the most on pharmaceutical drugs?

mikanebu

Several graphs illustrate pharmaceutical drug spending across a list of OECD countries. The data is clean and available in several formats, such as CSV, JSON, and zip.

Pharm...


Introducing private datasets on the DataHub

anuveyatsu, rufuspollock

Today we are releasing support for private datasets on the DataHub. Private datasets are exactly that: private and visible and accessible only to their owners.

This feature ...


Data desktop app - alpha release with drag and drop data publishing support

anuveyatsu

We are pleased to announce the launch of our new desktop application for DataHub users. The app brings drag and drop publishing of data. In addition, users can preview and validate their data prior...


How to use Data Packages from R

mikanebu, anuveyatsu

This tutorial demonstrates how to use Data Packages from R. We assume that you already know about Data Packages and its ...

Import online data files directly with scheduling

anuveyatsu, rufuspollock

Users can now import online data files directly into the DataHub using the data command line tool -- and set up scheduled re-imports at the same time.

We're very excited about th...


Core Data: Essential Datasets for Data Wranglers and Data Scientists

rufuspollock, mikanebu, anuveyatsu

The "Core Data" project provides essential data for the data wrangling and data science community. Its online home is on the DataHub:

https://datahub.io/core

https://datahub.io/docs/c...


See events and activity related to datasets or publishers

anuveyatsu

You can now see publisher and dataset related events. As we are tracking processes happening in our system, users have the ability to discover which publishers have been active or which datasets are updated ...


Datasets in zip format

anuveyatsu

We are now generating compressed versions of datasets so users can download a dataset as a single file. You can find it in the “Data Files” table on the showcase page. For example, you can have a l...


Previews for large datasets

anuveyatsu

We are now generating preview versions of large datasets so your web browser does not crash when loading a large amount of data. The preview versions consist of the first 5k rows of datasets (if a dataset ...


Vega views upgrade - now using v3

anuveyatsu

As you know, publishers can create various views using Vega visualizations in DataHub (learn more about views here). We have just upgraded our platform to use Vega...


Excel Files on the DataHub: Automated Previews and Data Extraction

anuveyatsu

In this tutorial, we will explain how to push Excel data to the DataHub. When an Excel file is pushed, we can extract data from selected sheets for previewing and downloading in alternative formats...


Data Package v1 Specifications. What has Changed and how to Upgrade

mikanebu

This post walks you through the major changes in the Data Package v1 specs compared to pre-v1. It covers changes in the full suite of Data Package specifications including Data Resources and Table ...
