What is a Time-Series?
A time series is a series of data points indexed (or listed or sorted) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. For example:
- Weekly sales for each store.
- Ambient temperature recorded at a periodic (e.g., every minute, hour, etc.) interval.
- Machine vibrations recorded at an hourly level.
- Hourly network traffic.
You typically need two columns for any time series: a time column and a value column. For ambient temperature, for example, the two columns would be the timestamp and the temperature recorded at that timestamp. Often, your data contains several time series together. For example, the temperature data above might also include a location column, in which case you are looking at multiple time series, one per location, put together. Daisho is fully capable of handling such scenarios.
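To make the "multiple time series put together" idea concrete, here is a minimal Python sketch that splits such a table into one series per location. It does not show how Daisho does this internally; the row layout, the location names, and the helper `split_by_key` are assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: (timestamp, location, temperature in degrees Celsius).
rows = [
    ("2024-01-01T01:00", "plant_a", 21.1),
    ("2024-01-01T00:00", "plant_a", 21.5),
    ("2024-01-01T00:00", "plant_b", 18.0),
    ("2024-01-01T01:00", "plant_b", 17.6),
]

def split_by_key(rows):
    """Group (timestamp, key, value) rows into one time-ordered series per key."""
    series = defaultdict(list)
    for stamp, key, value in rows:
        series[key].append((datetime.fromisoformat(stamp), value))
    for points in series.values():
        points.sort(key=lambda point: point[0])  # keep each series in time order
    return dict(series)

series_by_location = split_by_key(rows)
```

Each entry of `series_by_location` is then an ordinary two-column series (timestamp, value) that can be analysed on its own.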
Time-Series Analysis. And When to Use It.
The typical questions one might seek to answer in time-series analysis are:
- Are there any segments/clusters that behave similarly? Often, you have time-series data for several units, e.g., machines, and you want to know which machines show similar yield degradation over time. Effectively, you are plotting yield for each machine over time and asking which lines look similar (have similar shapes).
- Have there been times in the past when anomalies occurred? And have anomalies occurred across dimensions at the same time?
- Time-based trends/Leads/Lags: How do actions in the past impact results in the present? Oftentimes, actions (e.g., pricing changes) tend to show their full impact (e.g., impact on sales) with a certain lag.
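The lead/lag question in the last bullet can be illustrated with a lagged correlation. The sketch below is one common way to look for a lag, not a description of Daisho's method; the `price` and `sales` numbers are invented, and `pearson` and `best_lag` are hypothetical helper names.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def best_lag(action, result, max_lag):
    """Return (lag, correlation) for the lag at which action[t - lag]
    correlates most strongly (in absolute value) with result[t]."""
    scores = {}
    for lag in range(max_lag + 1):
        shifted_action = action[: len(action) - lag]
        aligned_result = result[lag:]
        scores[lag] = pearson(shifted_action, aligned_result)
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]

# Hypothetical data: sales respond to a price change two periods later.
price = [10, 10, 12, 12, 9, 9, 11, 11, 10, 12]
sales = [50, 50, 50, 50, 40, 40, 55, 55, 45, 45]

lag, corr = best_lag(price, sales, max_lag=4)
```

With this invented data the strongest relationship appears at a lag of two periods, and the correlation is negative (higher prices, lower sales).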
The key to time-series analysis is the dependence on time. The dependence might come in one or both of the following ways:
- What’s happening at time (t) depends on what happened at time (t-1). For example, an upgrade to machinery yesterday (t-1) led to an increase in production today (t). Or, yield has been falling for the past few weeks (time till t-1), so we can expect it to fall today (t) too.
- There’s an inherent pattern in data which is time-driven. For example, ambient temperature follows a definite seasonal pattern.
Time-series analysis is typically used when you expect at least one of the above conditions to hold. If neither does, you might be better off with standard analysis - drivers, clusters, etc. - as the case may be.
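One rough way to check whether either condition holds is to look at the autocorrelation of the series: a high value at lag 1 suggests dependence on the previous point, and a high value at a longer lag suggests a repeating pattern of that length. This is a sketch on invented data; the function `autocorrelation` and the two example series are assumptions for illustration only.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag (lag >= 1)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    covariance = sum(
        (values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance

# Hypothetical series with a steady upward drift:
# each value is close to the previous one.
trending = [float(t) for t in range(20)]

# Hypothetical series whose pattern repeats every 4 steps.
seasonal = [0.0, 1.0, 0.0, -1.0] * 5

drift_score = autocorrelation(trending, lag=1)    # strongly positive
cycle_score = autocorrelation(seasonal, lag=4)    # strongly positive
half_cycle = autocorrelation(seasonal, lag=2)     # negative: half a cycle apart
```

If all such scores are close to zero, the series shows little dependence on time and a standard (non-time-series) analysis may be the better fit.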
What is the data and information required?
The only MUST-HAVE in your data is the time variable. This is all the information you need to give Daisho to kick off time-series analysis.
Daisho automatically builds time-series analysis for ALL columns in your data using the time variable you identified. At any point in your analysis, you can decide to roll up the data. The rollup dimensions, of course, are limited to the columns that are present in your data!
How Daisho Does Time-Series Analysis. And What You Can Do Today.
The most important step in time-series analysis is to tease out the relationship with time. Daisho does this automatically for any kind of data. It tries out many different kinds of patterns - one of which is NO relationship with time - and picks the relationship that fits the data best. This is completely automated, and is the starting point for ANY time-series analysis on Daisho.
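The idea of comparing candidate patterns, including "no relationship with time", can be sketched as a small model-selection exercise. The candidate patterns Daisho actually tries and the criterion it uses are not described here, so everything below is an assumption: two candidates only (a constant and a straight-line trend), scored with the Bayesian information criterion (BIC), with hypothetical function names.

```python
from math import log

def fit_constant(ys):
    """'No relationship with time': predict the mean everywhere (1 parameter)."""
    mean = sum(ys) / len(ys)
    return [mean] * len(ys), 1

def fit_linear(ys):
    """Straight-line trend fitted by least squares (2 parameters)."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    slope = (
        sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
        / sum((t - t_mean) ** 2 for t in range(n))
    )
    return [y_mean + slope * (t - t_mean) for t in range(n)], 2

def bic(ys, fitted, n_params):
    """BIC under Gaussian errors; lower is better."""
    n = len(ys)
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sse = max(sse, 1e-12)  # guard against log(0) on a perfect fit
    return n * log(sse / n) + n_params * log(n)

def pick_pattern(ys):
    """Return the name of the candidate pattern with the lowest BIC."""
    candidates = {"no time relationship": fit_constant, "linear trend": fit_linear}
    scores = {name: bic(ys, *fit(ys)) for name, fit in candidates.items()}
    return min(scores, key=scores.get)

# Hypothetical series: one flat with noise, one steadily rising.
flat = [5.0, 5.2, 4.9, 5.1, 4.8, 5.0, 5.1, 4.9, 5.2, 4.8]
rising = [1.0, 2.1, 2.9, 4.2, 5.0, 5.8, 7.1, 8.0, 9.2, 9.9]
```

The penalty term in BIC is what allows the simpler "no time relationship" candidate to win when a trend would add a parameter without improving the fit much.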
Data Resolution vs. Analysis Resolution
Oftentimes, data is collected at a certain resolution (e.g., hourly/daily), while you might want to run your analysis at a different resolution (e.g., weekly/monthly). Daisho lets you do that - you just need to select your analysis resolution, and Daisho automatically rolls up your data accordingly.
You might also want to aggregate data based on other dimensions. For example, you might have sales data at a store level, but you want to run your analysis at a district or state level. Again, you can set the aggregation rollup, and Daisho automatically does the rest.
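Both kinds of rollup amount to grouping rows and aggregating the value column. The sketch below rolls hypothetical hourly sales up to daily totals, first per store and then per district. The row layout and the `roll_up` helper are assumptions for illustration, and summing is only one possible aggregation (an average would suit a measurement like temperature better).

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hourly rows: (timestamp, store, district, sales).
rows = [
    ("2024-03-01T09:00", "store_1", "north", 120.0),
    ("2024-03-01T10:00", "store_1", "north", 80.0),
    ("2024-03-01T09:00", "store_2", "north", 50.0),
    ("2024-03-02T09:00", "store_1", "north", 100.0),
    ("2024-03-02T09:00", "store_3", "south", 70.0),
]

def roll_up(rows, dimension_index):
    """Sum sales per (day, dimension value).

    dimension_index selects the grouping column: 1 for store, 2 for district.
    """
    totals = defaultdict(float)
    for row in rows:
        day = datetime.fromisoformat(row[0]).date().isoformat()
        totals[(day, row[dimension_index])] += row[3]
    return dict(totals)

daily_by_store = roll_up(rows, dimension_index=1)
daily_by_district = roll_up(rows, dimension_index=2)
```

Here the time rollup (hourly to daily) and the dimension rollup (store to district) happen in the same grouping step, keyed by the pair (day, dimension value).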
Data Quality
As with every recipe on Daisho, you get an input data quality report automatically. You can see stats for each time-series, as well as time-periods where data is missing.
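As an illustration of what "time-periods where data is missing" means, the following sketch lists the expected timestamps that are absent from a series. It is not Daisho's data quality report; the function name `missing_periods` and the sample timestamps are assumptions.

```python
from datetime import datetime, timedelta

def missing_periods(stamps, step):
    """Return the expected timestamps between the first and last
    observation that do not appear in `stamps`."""
    present = set(stamps)
    current, end = min(stamps), max(stamps)
    missing = []
    while current <= end:
        if current not in present:
            missing.append(current)
        current += step
    return missing

# Hypothetical hourly readings with a two-hour hole (03:00 and 04:00).
stamps = [datetime(2024, 1, 1, hour) for hour in (0, 1, 2, 5, 6)]
gaps = missing_periods(stamps, timedelta(hours=1))
```

The sketch assumes observations fall exactly on the expected grid; real data with jittered timestamps would need rounding to the resolution first.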
Analysis
Daisho currently supports several univariate analyses:
- Automatic trend detection after adjusting for seasonality.
- Anomaly detection after adjusting for trends and seasonality.
- Auto-discovered clusters/segments.
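To show what "anomaly detection after adjusting for trends and seasonality" can look like, here is a deliberately simple sketch: fit a straight-line trend, remove a per-position seasonal component, and flag residuals that are far from the rest. Daisho's actual algorithms are not documented here, so the linear trend, the median-based seasonal estimate, the median-absolute-deviation (MAD) score, the threshold of 5, and the function names are all assumptions.

```python
from statistics import median

def linear_fit(ys):
    """Least-squares straight line through the series, evaluated at each step."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    slope = (
        sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
        / sum((t - t_mean) ** 2 for t in range(n))
    )
    return [y_mean + slope * (t - t_mean) for t in range(n)]

def find_anomalies(ys, period, threshold=5.0):
    """Return indices whose residual, after removing trend and seasonality,
    lies more than `threshold` MADs from the median residual."""
    trend = linear_fit(ys)
    detrended = [y - f for y, f in zip(ys, trend)]
    # Seasonal component: median detrended value at each position in the cycle.
    seasonal = [median(detrended[p::period]) for p in range(period)]
    residuals = [d - seasonal[t % period] for t, d in enumerate(detrended)]
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals) or 1e-9  # avoid dividing by 0
    return [t for t, r in enumerate(residuals) if abs(r - center) / mad > threshold]

# Hypothetical series: upward trend plus a 4-step cycle, with a spike at index 13.
pattern = [0.0, 2.0, 0.0, -2.0]
ys = [0.5 * t + pattern[t % 4] for t in range(24)]
ys[13] += 10.0

anomalies = find_anomalies(ys, period=4)
```

Medians are used instead of means so that the spike itself does not distort the seasonal estimate or the spread it is being compared against.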
In addition, you can also do multi-variate outlier analysis. We have Driver Analysis coming soon as well!