Archived documentation version rendered and hosted by DevNetExpertTraining.com

Sample data

Use sample data to familiarize yourself with time series data and InfluxDB Cloud. Sample datasets in InfluxDB Cloud let you access time series data without having to write data to InfluxDB. Sample datasets are available for download and can be written to InfluxDB or loaded at query time.

The sample data below contains both static and live datasets. A static sample dataset is not updated regularly and has fixed timestamps. A “live” sample dataset is updated regularly.

If you write a static sample dataset to a bucket with a limited retention period, use sample.alignToNow() to shift timestamps so that the last point in the set aligns to now. This helps prevent writing points with timestamps older than the bucket’s retention period.
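
As a minimal sketch of the step described above, the following pipes a static dataset through sample.alignToNow() before writing it. The bucket name example-bucket is a placeholder, and the bird migration dataset is chosen only for illustration.

```flux
import "influxdata/influxdb/sample"

sample.data(set: "birdMigration")
    // Shift timestamps so the newest point in the set aligns to now
    |> sample.alignToNow()
    // "example-bucket" is a placeholder; replace it with your bucket name
    |> to(bucket: "example-bucket")
```

Points that still fall outside the retention period after alignment (for example, a year of data written to a 30-day bucket) would likely not be retained, so pick a bucket whose retention period covers the span of the dataset.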

Sample datasets

Air sensor sample data

Size: ~600 KB • Updated: every 15m

Air sensor sample data represents an “Internet of Things” (IoT) use case by simulating temperature, humidity, and carbon monoxide levels for multiple rooms in a building.

To download and output the air sensor sample dataset, use the sample.data() function.

import "influxdata/influxdb/sample"

sample.data(set: "airSensor")
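
Because sample.data() returns a regular stream of tables, it can be piped into other Flux functions. The sketch below narrows the output to a single reading per room; the field key temperature is an assumption based on the dataset description above and may differ in the actual data.

```flux
import "influxdata/influxdb/sample"

sample.data(set: "airSensor")
    // Assumed field key; check the dataset output for the exact name
    |> filter(fn: (r) => r._field == "temperature")
    // Keep only the most recent point in each table
    |> last()
```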

Companion SQL sensor data

The air sensor sample dataset is paired with a relational SQL dataset with meta information about sensors in each room. These two sample datasets are used to demonstrate how to join time series data and relational data with Flux in the Query SQL data sources guide.

Download SQL air sensor data
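
The pairing described above can be sketched roughly as follows. This is an illustration under several assumptions, not the guide's exact query: the PostgreSQL connection string is a placeholder, and the table name sensors, the column names, and the shared sensor_id key are assumed rather than taken from this page.

```flux
import "sql"
import "influxdata/influxdb/sample"

// Relational metadata about each sensor.
// The connection string, table, and columns are hypothetical placeholders.
sensorInfo = sql.from(
    driverName: "postgres",
    dataSourceName: "postgresql://username:password@example-host:5432/example-db",
    query: "SELECT sensor_id, location FROM sensors",
)

// Time series readings from the air sensor sample dataset.
// The field key "temperature" is an assumption.
sensorMetrics = sample.data(set: "airSensor")
    |> filter(fn: (r) => r._field == "temperature")

// Join the two streams on the column they are assumed to share
join(tables: {metric: sensorMetrics, info: sensorInfo}, on: ["sensor_id"])
```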

Bird migration sample data

Size: ~1.2 MB • Updated: N/A

Bird migration sample data is adapted from the Movebank: Animal Tracking data set and represents animal migratory movements throughout 2019.

To download and output the bird migration sample dataset, use the sample.data() function.

import "influxdata/influxdb/sample"

sample.data(set: "birdMigration")

The bird migration sample dataset is used in the Work with geo-temporal data guide to demonstrate how to query and analyze geo-temporal data.

NOAA sample data

The following two National Oceanic and Atmospheric Administration (NOAA) datasets are available to use with InfluxDB.

NOAA NDBC data

Size: ~1.3 MB • Updated: every 15m

The NOAA National Data Buoy Center (NDBC) dataset provides observations (updated every 15 minutes) from the NOAA NDBC network of buoys throughout the world.

To download and output the most recent NOAA NDBC observations, use the sample.data() function.

import "influxdata/influxdb/sample"

sample.data(set: "noaa")

Store historical NOAA NDBC data

The NOAA NDBC sample dataset returns only the most recent observations, not historical observations. To regularly query and store NOAA NDBC observations, add the following as an InfluxDB task. Replace example-org with your organization name and example-bucket with the name of the bucket to store data in.

import "influxdata/influxdb/sample"

option task = {
    name: "Collect NOAA NDBC data",
    every: 15m,
}

sample.data(set: "noaa")
    |> to(
        org: "example-org",
        bucket: "example-bucket",
    )

NOAA water sample data

Size: ~10 MB • Updated: N/A

The NOAA water sample dataset is a static dataset extracted from NOAA Center for Operational Oceanographic Products and Services data. The sample dataset includes 15,258 observations of water levels (ft) collected every six minutes at two stations, Santa Monica, CA (ID 9410840) and Coyote Creek, CA (ID 9414575), from August 18, 2015 through September 18, 2015.

Store NOAA water sample data to avoid bandwidth usage

To avoid re-downloading this 10 MB dataset every time you run a query, we recommend that you create a new bucket (noaa) and write the NOAA data to it. We also recommend updating the timestamps of the data to be relative to now(). To do so, run the following, replacing example-org with your organization name:

import "experimental/csv"

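// Shift timestamps to be relative to now() while keeping the spacing between points
relativeToNow = (tables=<-) => tables
    // Seconds between each point and the point before it
    |> elapsed()
    // Order points newest first
    |> sort(columns: ["_time"], desc: true)
    // Running total of seconds, counted back from the newest point
    |> cumulativeSum(columns: ["elapsed"])
    // Subtract that offset, converted to nanoseconds, from now()
    |> map(fn: (r) => ({r with _time: time(v: int(v: now()) - r.elapsed * 1000000000)}))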

csv.from(url: "https://influx-testdata.s3.amazonaws.com/noaa.csv")
    |> relativeToNow()
    |> to(bucket: "noaa", org: "example-org")

The NOAA water sample dataset is used to demonstrate Flux queries in the Common queries and Common tasks guides.
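
Once the data is stored, later queries can read from the bucket instead of downloading the CSV again. A minimal sketch, assuming the noaa bucket created above and timestamps already shifted to be relative to now(); the 35-day range is an arbitrary choice meant to cover the roughly one-month span of the dataset.

```flux
from(bucket: "noaa")
    // The stored data spans about one month ending near the time it was written
    |> range(start: -35d)
    // Return a few rows per table as a quick check that the write succeeded
    |> limit(n: 10)
```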

USGS Earthquake data

Size: ~6 MB • Updated: every 15m

The United States Geological Survey (USGS) earthquake dataset contains observations collected from USGS seismic sensors around the world over the last week. Data is updated approximately every 15 minutes.

To download and output the last week of USGS seismic data, use the sample.data() function.

import "influxdata/influxdb/sample"

sample.data(set: "usgs")

Write sample data with an InfluxDB task

Use the Flux InfluxDB sample package to download and write sample data to InfluxDB.

Add the following as an InfluxDB task.

import "influxdata/influxdb/sample"

option task = {name: "Collect NOAA NDBC data", every: 15m}

sample.data(set: "noaa")
    |> to(bucket: "noaa")
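
The same task pattern should work for the other datasets by changing the set value. To see which dataset names are available, the sample package also provides sample.list(); the sketch below simply outputs that listing, and the exact columns it returns are not described on this page.

```flux
import "influxdata/influxdb/sample"

// Output information about the available sample datasets
sample.list()
```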
